Mining Anchor Text Trends for Retrieval
Na Dai and Brian D. Davison
Department of Computer Science & Engineering, Lehigh University, USA
Abstract. Anchor text is widely regarded as a useful resource for complementing
the representation of target pages and is broadly used in web search. However,
previous research uses anchor text from only a single snapshot to improve web
search, and historical trends in anchor text importance have not been well modeled
in anchor text weighting strategies. In this paper, we propose a novel temporal
anchor text weighting method that incorporates trends in anchor text creation over
time. It combines the historical weights of anchor text by propagating those
weights among snapshots along the time axis. We evaluate our method on a
real-world web crawl from the Stanford WebBase. Our results demonstrate that
the proposed method can produce a significant improvement in ranking quality.
When a web page designer creates a link to another page, she will typically highlight a
portion of the text of the current page, and embed it within a reference to the target page.
This text is called the anchor text of the link, and usually forms a succinct description
of the target page so that the reader of the current page can decide whether or not to
follow the hyperlink.
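To make the notion concrete, the following is a minimal sketch of how anchor text might be collected from a single HTML page, using only Python's standard `html.parser` module. The class name `AnchorTextExtractor`, the helper `extract_anchor_texts`, and the sample page with its `example.org` URL are illustrative assumptions and are not part of the system described in this paper.

```python
from html.parser import HTMLParser


class AnchorTextExtractor(HTMLParser):
    """Collect (target URL, anchor text) pairs from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []      # completed (href, anchor text) pairs
        self._href = None    # href of the <a> element currently open, if any
        self._chunks = []    # text fragments seen inside that element

    def handle_starttag(self, tag, attrs):
        # Only <a> elements that carry an href are hyperlinks.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._chunks = []

    def handle_data(self, data):
        # Text is kept only while a hyperlink is open.
        if self._href is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Join the fragments and collapse runs of whitespace.
            text = " ".join("".join(self._chunks).split())
            self.links.append((self._href, text))
            self._href = None


def extract_anchor_texts(html):
    """Return every (href, anchor text) pair found in the given HTML string."""
    parser = AnchorTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page fragment, for illustration only.
page = (
    '<p>See the <a href="http://example.org/webbase">Stanford '
    "<b>WebBase</b> project</a> for crawl data.</p>"
)
print(extract_anchor_texts(page))
```

In this sketch, text nested in inline markup inside the link (such as the `<b>` element) is still counted as part of the anchor text, while the surrounding sentence is ignored.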
Links to a target page are ostensibly created by people other than the author of
the target, and thus the anchor texts likely include summaries and alternative
representations of the target page content. Because these anchor texts are typically
short and descriptive, they are potentially similar to queries and can reflect users'
information needs. Hence, anchor text has been widely utilized as an important
component of the ranking functions of commercial search engines.
Anchor text can also be important to other tasks, such as query intent classification,
query refinement, and query translation. Mining anchor text can
help to better understand queries an