Beyond Ranking: Focus on Content and Query Understanding

dtunkelang, 21 slides, Oct 10, 2025

About This Presentation

This talk, given as a teaser for Doug Turnbull and Trey Grainger's class on AI-powered search, makes the case for prioritizing content understanding and query understanding in your search applications to ensure relevance and thus enable better ranking. It also explains ways to use the bag-of-doc...


Slide Content

Beyond Ranking: Focus on
Content and Query Understanding
Daniel Tunkelang
High-Class Consultant
queryunderstanding.com

tl;dr
Ranking optimizes order of relevant results – at best.
Content and query understanding ensure retrieval of relevant results.

Let’s make the case for ranking.
The searcher has an information need.
Each result provides some utility.
Sort all results by expected utility.

Sounds reasonable, right?

earbuds under $30



The problem here isn’t the ranking of results.

It’s the failure to understand the query and the retrieval of irrelevant results.

Problems with ranking.
Similar results are equally relevant (van Rijsbergen, 1979).
Ranking ≠ Relevance ≠ Utility (Turpin and Scholer, 2006).
Utility of results is not additive.
Non-relevant results = negative utility.

A better approach.
Content understanding:
Establish a robust representation of the content.
Query understanding:
Establish a robust representation of the query.
Align these to retrieve relevant content, then rank.
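
The retrieve-then-rank split above can be sketched in a few lines of Python. This is only an illustration: the `relevance` and `desirability` scores, the `Result` class, and the 0.8 threshold are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass
class Result:
    title: str
    relevance: float     # hypothetical query-content alignment score in [0, 1]
    desirability: float  # hypothetical ranking signal, e.g. popularity


def retrieve_then_rank(candidates, min_relevance=0.8):
    """Keep only relevant results, then order the survivors by desirability."""
    relevant = [r for r in candidates if r.relevance >= min_relevance]
    return sorted(relevant, key=lambda r: r.desirability, reverse=True)


candidates = [
    Result("wireless earbuds", relevance=0.95, desirability=0.40),
    Result("earbud case", relevance=0.30, desirability=0.90),  # popular, not relevant
    Result("wired earbuds", relevance=0.90, desirability=0.70),
]
ranked = retrieve_then_rank(candidates)
print([r.title for r in ranked])
```

The popular but irrelevant item never reaches the ranker, so ranking only has to choose among results that are already relevant.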

Understanding the query means retrieving the right, relevant results.

Ranking still matters, but it is secondary to relevance.

Better yet, it can assume that results are relevant!

Reductionist and Holistic Approaches
Reductionist: break the problem into parts.
Holistic: solve the problem as a whole.


Complementary approaches – so use both!

Step 1: Content Understanding
Reductionist approach:
Extract content attributes to populate structured data.
Holistic approach:
Populate vectors so that cosines reflect similarity / substitutability.
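
As a rough illustration of the holistic approach, the sketch below computes the cosine between content vectors. The three-dimensional vectors are made-up toy values; in practice they would come from an embedding model, which the talk does not specify.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy vectors (hypothetical values, for illustration only).
gold_headphones = [0.9, 0.1, 0.2]
black_headphones = [0.8, 0.2, 0.2]
laptop = [0.1, 0.9, 0.3]

# Substitutable products should score higher than unrelated ones.
print(cosine(gold_headphones, black_headphones))
print(cosine(gold_headphones, laptop))
```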

Evaluating Content Understanding
Reductionist approach:
Compute precision and recall of structured data.
Holistic approach:
Correlate vector similarity to ground truth. If you don’t have ground truth, use an LLM to generate it.
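
For the reductionist evaluation, one possible sketch scores extracted attributes as a set of (attribute, value) pairs against a labeled set. The attribute names and values below are hypothetical examples, not data from the talk.

```python
def precision_recall(predicted, expected):
    """Precision and recall of predicted (attribute, value) pairs."""
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical labels and extractor output for one product title.
expected = {("brand", "Beats"), ("color", "Gold"), ("type", "Headphones")}
predicted = {("brand", "Beats"), ("color", "Black")}

precision, recall = precision_recall(predicted, expected)
print(precision, recall)
```

Here one of the two predicted pairs is correct and one of the three expected pairs was found, so the extractor is penalized both for the wrong color and for the missing type.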

Step 2: Query Understanding
Reductionist approach:
Extract query attributes to populate structured data.
Holistic approach:
Bag-of-documents query vector that aggregates relevant result vectors.
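
To make the reductionist approach concrete, here is a minimal sketch that pulls a price cap out of the earlier example query, "earbuds under $30". The regular expression, the `parse_query` helper, and the toy catalog are assumptions for illustration; a production system would more likely use a trained tagger.

```python
import re

# Matches phrases such as "under $30" or "under 30.50" (a deliberately narrow pattern).
PRICE_CAP = re.compile(r"\bunder\s+\$?(\d+(?:\.\d+)?)", re.IGNORECASE)


def parse_query(query):
    """Split a query into free-text keywords and an optional maximum price."""
    match = PRICE_CAP.search(query)
    max_price = float(match.group(1)) if match else None
    keywords = " ".join(PRICE_CAP.sub("", query).split())
    return {"keywords": keywords, "max_price": max_price}


parsed = parse_query("earbuds under $30")

# Hypothetical catalog: the structured price constraint filters retrieval.
catalog = [
    {"title": "basic earbuds", "price": 19.99},
    {"title": "premium earbuds", "price": 129.00},
]
matches = [
    item for item in catalog
    if parsed["max_price"] is None or item["price"] < parsed["max_price"]
]
print(parsed, [item["title"] for item in matches])
```

Treating "under $30" as a constraint rather than as keywords is what keeps the expensive item out of the result set before any ranking happens.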

Evaluating Query Understanding
Reductionist approach:
Compute precision and recall of query attributes.
Holistic approach:
Correlate query-result and query-query similarity to ground truth. Again, you can use an LLM to generate it.
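
One way to run the holistic evaluation is to correlate model similarities with ground-truth judgments. The sketch below uses a hand-written Pearson correlation; the similarity scores and binary labels are invented, and the talk does not say which correlation measure to use.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: model cosine per query pair, and a label (1 = same intent).
model_similarity = [0.98, 0.91, 0.35, 0.20]
ground_truth = [1, 1, 0, 0]

correlation = pearson(model_similarity, ground_truth)
print(correlation)
```

A correlation near 1 suggests the vectors separate equivalent from unrelated queries; the labels could come from human judges or, as the slide suggests, from an LLM.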

What’s a Bag of Documents?
Search queries often don’t look like document titles.

iphone
Straight Talk Apple iPhone 13, 128GB, Midnight - Prepaid Smartphone [Locked to Straight Talk]

laptops
HP 14 inch Laptop Intel Core i3-N305 8GB RAM 256GB SSD Moonlight Blue

headphones
Beats Solo3 Wireless On-Ear Headphones - Gold

Computing Bag-of-Documents Vector
query → documents → aggregate vector
mens black tshirts → [0.13, 0.81, …], [0.09, 0.75, …], [0.98, 0.77, …], … → [0.11, 0.79, …]
Easy for head queries.
More work to train model that generalizes to tail and unseen queries.
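
The query → documents → aggregate vector flow can be sketched as follows. The slide only says the vectors are aggregated; taking the unweighted mean is an assumption here, and the two-dimensional vectors are toy values rather than real embeddings.

```python
def bag_of_documents(doc_vectors):
    """Aggregate relevant document vectors into one query vector.

    Assumption: aggregation is an unweighted mean; weighting by
    engagement or relevance would be an equally plausible choice.
    """
    n = len(doc_vectors)
    dims = len(doc_vectors[0])
    return [sum(vec[i] for vec in doc_vectors) / n for i in range(dims)]


# Toy 2-d vectors for documents relevant to one head query (hypothetical).
relevant_docs = [[0.13, 0.81], [0.09, 0.75]]
query_vector = bag_of_documents(relevant_docs)
print(query_vector)
```

For head queries the relevant documents can be taken from engagement data; tail and unseen queries need a model trained to predict this vector directly from the query text.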

Computing Query Similarity
Queries: black tshirts for men; mens black t-shirt
[0.13, 0.81, …], [0.09, 0.75, …] → [0.11, 0.79, …]
[0.13, 0.81, …], [0.09, 0.77, …] → [0.12, 0.78, …]
cos = 0.98
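
A small sketch of query similarity: compare two bag-of-documents vectors by cosine and treat pairs above a threshold as the same intent. The 0.95 threshold and the `same_intent` helper are hypothetical, and the two-dimensional vectors are toy values, so the scores will not match the figure on the slide.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def same_intent(query_vec_a, query_vec_b, threshold=0.95):
    """Treat two queries as equivalent if their vectors are close (assumed threshold)."""
    return cosine(query_vec_a, query_vec_b) >= threshold


# Toy bag-of-documents vectors for three queries (hypothetical values).
black_tshirts_for_men = [0.11, 0.79]
mens_black_tshirt = [0.12, 0.78]
laptops = [0.90, 0.10]

print(same_intent(black_tshirts_for_men, mens_black_tshirt))
print(same_intent(black_tshirts_for_men, laptops))
```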

Improve Recall and More!
Replace token-level expansion with holistic approach.
Identifying equivalent queries defragments intents spread across queries in autocomplete, search suggestions, etc.
Can even relate keyword queries to browse nodes!

Computing Query Specificity
mens black tshirts
[0.13, 0.81, …], [0.09, 0.75, …], [0.98, 0.77, …], … → [0.11, 0.79, …]
0.82, 0.75, 0.81 → 0.79
Broad queries have low specificity, while narrow queries have high specificity.
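
One plausible reading of the specificity computation is the mean cosine between each relevant document vector and the bag-of-documents vector; the figures on the slide are consistent with that, since 0.82, 0.75 and 0.81 average to about 0.79. The sketch below assumes that definition and uses toy two-dimensional vectors.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mean_vector(vectors):
    """Unweighted mean of a list of equal-length vectors."""
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(len(vectors[0]))]


def specificity(doc_vectors):
    """Assumed definition: mean cosine of each document vector to the centroid."""
    centroid = mean_vector(doc_vectors)
    return sum(cosine(vec, centroid) for vec in doc_vectors) / len(doc_vectors)


# Hypothetical document vectors for a narrow query and a broad query.
narrow_query_docs = [[0.90, 0.10], [0.88, 0.12], [0.92, 0.09]]
broad_query_docs = [[0.90, 0.10], [0.10, 0.90], [0.60, 0.60]]

print(specificity(narrow_query_docs), specificity(broad_query_docs))
```

Tightly clustered documents give a value near 1, while documents spread across unrelated regions pull the average down.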

Optimize the Search Journey!
Low query specificity can trigger interface elements that elicit more signal from the searcher, e.g., refinements.
Autocomplete can favor high-specificity queries, which are more likely to lead to a conversion.
High specificity makes relevance critical; low specificity means more room to trade off relevance for desirability.
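
The points above could drive the search experience roughly as sketched here. Every threshold, field name, and specificity score in this block is invented for illustration; the talk gives no concrete values.

```python
def plan_experience(specificity, broad_below=0.6):
    """Choose interface behavior from query specificity (thresholds are made up)."""
    broad = specificity < broad_below
    return {
        # Broad queries: ask the searcher for more signal.
        "show_refinements": broad,
        # Narrow queries: hold results to a stricter relevance bar.
        "min_relevance": 0.7 if broad else 0.9,
    }


def order_suggestions(suggestions):
    """Order autocomplete candidates so higher-specificity queries come first."""
    return sorted(suggestions, key=lambda pair: pair[1], reverse=True)


# Hypothetical (query, specificity) pairs.
suggestions = [("shirts", 0.42), ("mens black tshirts", 0.79), ("black shirts", 0.61)]

print(plan_experience(0.42))
print([query for query, _ in order_suggestions(suggestions)])
```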

Ranking still matters!
All results should be relevant, but not all relevant results are equally valuable to searchers.
Ranking should reflect desirability, personalization, etc.
In fact, getting query understanding and relevance right is what makes it possible for ranking to do its job!

Summary
Ranking can only optimize after relevance is guaranteed.
Invest in content and query understanding first.
Apply reductionist and holistic methods together.

Thank You!
https://www.linkedin.com/in/dtunkelang/
https://dtunkelang.medium.com/
https://queryunderstanding.com/
http://contentunderstanding.com/