Friday, February 28, 2014

Muddist Week 7

Why is positive feedback more useful than negative feedback to an IR system?

Notes Week 8

Week 8
MIR Ch10 User Interface and Visualization
Human computer interaction
1)   Principles
·      Offering informative feedback is especially important for information access interfaces.
·      Reduce working memory load. Information access is an iterative process, the goals of which shift and change as information is encountered. One key way information access interfaces can help with memory load is to provide mechanisms for keeping track of choices made during the search process.
·      Provide alternative interfaces for novice and expert users. An important tradeoff in all user interface design is that of simplicity vs power.
2)   Role of visualization
·      Humans are highly attuned to images and visual information. Pictures can be captivating and appealing, especially if well designed.
·      The growing prevalence of fast graphics processors and high-resolution color monitors is increasing interest in information visualization.
·      Visualization of inherently abstract information is more difficult, and visualization of textually represented information is especially challenging.
3)   Evaluating
·      An important aspect of HCI is the methodology for evaluation of user interface techniques. Precision and recall measures have been widely used for comparing the ranking results of non-interactive systems, but are less appropriate for assessing interactive systems.
·      Empirical data involving human users is time consuming to gather and difficult to draw conclusions from.
The information access process
1)   Model of interaction
·      Start with an information need
·      select a system and collections to search on
·      formulate a query
·      send the query to system
·      receive the results in the form of information items
·      scan, evaluate, and interpret the results
·      either stop, or,
·      reformulate the query and repeat step 4
2)   Earlier interface studies
The bulk of the literature on information-seeking behavior concerns information intermediaries using online systems of bibliographic records.

Friday, February 21, 2014

Notes Week 7 and NO MUDDIST FOR THIS WEEK

Week 7
NO MUDDIST FOR WEEK 6
IIR
CH9 Relevance feedback and query expansion
This chapter discusses ways in which a system can help with query refinement, either fully automatically or with the user in the loop.
v Relevance feedback and pseudo relevance feedback
Ø  The idea of relevance feedback (RF) is to involve the user in the retrieval process so as to improve the final result set. In particular, the user gives feedback on the relevance of documents in an initial set of results. The basic procedure is:
§  The user issues a (short, simple) query.
§  The system returns an initial set of retrieval results.
§  The user marks some returned documents as relevant or nonrelevant.
§  The system computes a better representation of the information need based on the user feedback.
§  The system displays a revised set of retrieval results.
Ø  Image search provides a good example of relevance feedback. Not only is it easy to see the results at work, but this is a domain where a user can easily have difficulty formulating what they want in words, but can easily indicate relevant or nonrelevant images.
Ø  The Rocchio algorithm
§  The Rocchio algorithm is the classic algorithm for implementing relevance feedback. It models a way of incorporating relevance feedback information into the vector space model.
§  Relevance feedback can improve both recall and precision. But, in practice, it has been shown to be most useful for increasing recall in situations where recall is important. This is partly because the technique expands the query.
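The Rocchio update moves the query vector toward the centroid of the relevant documents and away from the centroid of the nonrelevant ones. A minimal sketch, assuming dense term-weight vectors and the commonly cited default weights (alpha = 1, beta = 0.75, gamma = 0.15):

```python
import numpy as np

def rocchio_update(q0, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio relevance feedback: shift the query vector q0 toward the
    centroid of the judged-relevant document vectors and away from the
    centroid of the judged-nonrelevant ones."""
    q = alpha * q0
    if len(relevant) > 0:
        q = q + beta * np.mean(relevant, axis=0)
    if len(nonrelevant) > 0:
        q = q - gamma * np.mean(nonrelevant, axis=0)
    # Negative term weights are usually clipped to zero in practice.
    return np.maximum(q, 0.0)
```

Because the expanded query picks up terms from the relevant documents, it tends to match more relevant documents, which is the recall effect noted above.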
Ø  Probabilistic relevance feedback
§  If a user has told us which documents are relevant and which are nonrelevant, we can build a classifier. One way of doing this is with a Naive Bayes probabilistic model. If R is a Boolean indicator variable expressing the relevance of a document, then we can estimate P(xt = 1|R), the probability of a term t appearing in a document, depending on whether it is relevant or not:
P̂(xt = 1 | R = 1) = |VRt| / |VR|
P̂(xt = 1 | R = 0) = (dft − |VRt|) / (N − |VR|)
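These two estimates, in the chapter's notation (|VRt|, |VR|, dft, N), translate directly into code; the function name is illustrative:

```python
def term_relevance_probs(vr_t, vr, df_t, n):
    """Estimate P(xt=1|R) from user judgments (IIR Ch. 9 notation):
    vr_t -- number of judged-relevant documents containing term t (|VRt|)
    vr   -- number of judged-relevant documents (|VR|)
    df_t -- document frequency of t over the whole collection
    n    -- total number of documents in the collection"""
    p_rel = vr_t / vr                    # P(xt = 1 | R = 1)
    p_nonrel = (df_t - vr_t) / (n - vr)  # P(xt = 1 | R = 0)
    return p_rel, p_nonrel
```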
Ø  When does relevance feedback work?
§  Firstly, the user has to have sufficient knowledge to be able to make an initial query which is at least somewhere close to the documents they desire.
§  Secondly, the relevance feedback approach requires relevant documents to be similar to each other.
Ø  RF is rarely used in web search.
Ø  Evaluation in RF strategies
§  Interactive relevance feedback can give very substantial gains in retrieval performance. Empirically, one round of relevance feedback is often very useful. Two rounds is sometimes marginally more useful.
§  There is some subtlety to evaluating the effectiveness of relevance feedback in a sound and enlightening way. The obvious first strategy is to start with an initial query q0 and to compute a precision-recall graph. A second idea is to use documents in the residual collection (the set of documents minus those assessed relevant) for the second round of evaluation.
Ø  Pseudo RF
Pseudo RF provides a method for automatic local analysis. It automates the manual part of relevance feedback, so that the user gets improved retrieval performance without an extended interaction. The method is to do normal retrieval to find an initial set of most relevant documents, to then assume that the top k ranked documents are relevant, and finally to do relevance feedback as before under this assumption.
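The retrieve / assume-top-k / expand / re-retrieve loop can be sketched end to end. Dot-product scoring and a Rocchio-style expansion with beta = 0.75 are illustrative choices, not the only ones:

```python
import numpy as np

def pseudo_relevance_feedback(q0, docs, k=10, beta=0.75):
    """One round of pseudo relevance feedback: do a normal retrieval,
    assume the top-k ranked documents are relevant, expand the query
    toward their centroid (a Rocchio-style step), and retrieve again.
    `docs` is a document-by-term weight matrix; scoring is dot product."""
    scores = docs @ q0
    top_k = np.argsort(-scores)[:k]            # assumed-relevant documents
    q1 = q0 + beta * docs[top_k].mean(axis=0)  # expanded query
    return np.argsort(-(docs @ q1))            # final ranking (doc indices)
```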
v Relevance feedback has been shown to be very effective at improving relevance of results. Its successful use requires queries for which the set of relevant documents is medium to large. Full relevance feedback is often onerous for the user, and its implementation is not very efficient in most IR systems. In many cases, other types of interactive retrieval may improve relevance by about as much with less work.
Beyond the core ad hoc retrieval scenario, other uses of relevance feedback include:
Ø  Following a changing information need (e.g., names of car models of interest change over time)
Ø  Maintaining an information filter (e.g., for a news feed). Such filters are discussed further.
Ø  Active learning (deciding which examples it is most useful to know the class of to reduce annotation costs).
v Global methods for query reformulation
Ø  This section briefly describes three global methods for expanding a query: aiding the user directly with vocabulary tools, using a manual thesaurus, and building a thesaurus automatically.
§  Vocabulary tools
Various user supports in the search process can help the user see how their searches are or are not working. This includes information about words that were omitted from the query because they were on stop lists, what words were stemmed to, the number of hits on each term or phrase, and whether words were dynamically turned into phrases.
§  Query expansion
In relevance feedback, users give additional input on documents (by marking documents in the results set as relevant or not), and this input is used to reweight the terms in the query for documents. In query expansion on the other hand, users give additional input on query words or phrases, possibly suggesting additional query terms.
§  Automatic thesaurus generation
As an alternative to the cost of a manual thesaurus, we could attempt to generate a thesaurus automatically by analyzing a collection of documents. There are two main approaches. One is simply to exploit word co-occurrence. The other approach is to use a shallow grammatical analysis of the text and to exploit grammatical relations or grammatical dependencies.
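The co-occurrence approach can be sketched by forming the term-term matrix A·Aᵀ from a term-document matrix A and cosine-normalizing it, so that terms appearing in similar documents score near 1. This assumes every term occurs in at least one document; nearest neighbors in the resulting matrix become thesaurus entries:

```python
import numpy as np

def cooccurrence_similarity(term_doc):
    """Term-term similarity from co-occurrence: C = A @ A.T for a
    term-by-document weight matrix A, normalized so each term has
    similarity 1.0 with itself (cosine normalization of C).
    Assumes every term appears in at least one document."""
    c = term_doc @ term_doc.T
    norms = np.sqrt(np.diag(c))
    return c / np.outer(norms, norms)
```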


Friday, February 14, 2014

Muddist Week 5

It's a question from this week's reading.

Evaluation of unranked sets

I don't understand why we use a harmonic mean rather than the simpler average (arithmetic mean). The book says: "When the values of two numbers differ greatly, the harmonic mean is closer to their minimum than to their arithmetic mean." But I can't see why we would want the mean to be close to the minimum.
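To see the book's claim numerically: a system that simply returns every document gets recall 1.0 but very low precision, and the two means treat that case very differently:

```python
def arithmetic_mean(p, r):
    return (p + r) / 2

def harmonic_mean(p, r):
    return 2 * p * r / (p + r)

# Return-everything system: recall = 1.0, precision = 0.01.
# The arithmetic mean still awards it 0.505, which looks respectable;
# the harmonic mean stays near 0.02, close to the worse of the two,
# so a degenerate system cannot score well on one measure alone.
```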
 

Notes Week 6

Week 6
CH8. EVALUATION IN IR
·      IR system evaluation
1.     Relevant or nonrelevant
a.     Relevance is assessed relative to an information need, not a query.
b.     Relevance can reasonably be thought of as a scale, with some documents highly relevant and others marginally so.
·      Standard test collections (particularly for ad hoc IR system)
1.     CRANFIELD
The pioneering test collection, the first to allow precise quantitative measures of information retrieval effectiveness, but too small by today's standards.
2.     Text Retrieval Conference (TREC) from 1992
3.     GOV2
4.     ...
·      Evaluation of unranked retrieval sets
The two most frequent and basic measures for information retrieval effectiveness are precision and recall.
1.     Precision is the fraction of retrieved documents that are relevant.
Precision = P(relevant|retrieved)
2.     Recall is the fraction of relevant documents that are retrieved


                 Relevant              Nonrelevant
Retrieved        true positives (tp)   false positives (fp)
Not retrieved    false negatives (fn)  true negatives (tn)

Recall = P(retrieved|relevant)
P = tp/(tp+fp) and R = tp/(tp+fn)
3.     The measures of precision and recall concentrate the evaluation on the return of true positives, asking what percentage of the relevant documents have been found and how many false positives have also been returned.
4.     In general we want to get some amount of recall while tolerating only a certain percentage of false positives. A single measure that trades off precision versus recall is the F measure, which is the weighted harmonic mean of precision and recall.
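The contingency-table formulas plus the weighted harmonic mean can be computed directly. This sketch uses the usual F_beta parameterization, where beta = 1 gives the balanced F1 and beta > 1 weights recall more heavily:

```python
def precision_recall_f(tp, fp, fn, beta=1.0):
    """Precision, recall, and the weighted harmonic mean F_beta
    from contingency-table counts. Assumes tp+fp > 0 and tp+fn > 0."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    b2 = beta * beta
    f = (1 + b2) * p * r / (b2 * p + r)  # F1 = 2PR/(P+R) when beta = 1
    return p, r, f
```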
5.     Q: Why do we use a harmonic mean rather than the simpler average (arithmetic mean)? P157
·      Evaluation of ranked retrieval results
The interpolated precision pinterp at a certain recall level r is defined as the highest precision found for any recall level r′ ≥ r.
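The definition translates directly into a maximum over measured (recall, precision) points; representing the measurements as parallel lists is an assumption for illustration:

```python
def interpolated_precision(recall_levels, precisions, r):
    """pinterp(r): the highest precision found at any measured recall
    level r' >= r. Assumes r is no larger than the highest measured
    recall, so at least one point qualifies."""
    return max(p for rl, p in zip(recall_levels, precisions) if rl >= r)
```

Interpolation is what smooths the sawtooth shape of a raw precision-recall curve into the usual monotonically non-increasing graph.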
·      Assessing relevance
One clear problem with the relevance-based assessment that we have presented is the distinction between relevance and marginal relevance: whether a document still has distinctive usefulness after the user has looked at certain other documents.
·      System quality and user utility
1.     System issues
2.     User utility
3.     Refining a deployed system
·      Results snippets
1.     In many cases the user will not want to examine all the returned documents and so we want to make the results list informative enough that the user can do a final ranking of the documents for themselves based on relevance to their information need.
2.     The two basic kinds of summaries are static, which are always the same regardless of the query, and dynamic (or query-dependent), which are customized according to the user’s information need as deduced from a query.
3.     A static summary generally comprises a subset of the document, metadata associated with the document, or both.
4.     Dynamic summaries display one or more “windows” on the document, aiming to present the pieces that have the most utility to the user in evaluating the document with respect to their information need.
5.     Given a variety of keyword occurrences in a document, the goal is to choose fragments which are: (i) maximally informative about the discussion of those terms in the document, (ii) self-contained enough to be easy to read, and (iii) short enough to fit within the normally strict constraints on the space available for summaries.
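Criterion (i) alone can be sketched as a sliding-window scan that keeps the window containing the most query-term occurrences. The function name and the fixed-width simplification are illustrative; real snippet generators also score criteria (ii) and (iii), for instance by preferring sentence boundaries:

```python
def best_window(doc_words, query_terms, width=20):
    """Return the fixed-width window of tokenized document text that
    contains the most query-term occurrences -- a simple stand-in for
    criterion (i) of snippet selection."""
    qs = set(query_terms)
    best_start, best_hits = 0, -1
    for start in range(max(1, len(doc_words) - width + 1)):
        hits = sum(1 for w in doc_words[start:start + width] if w in qs)
        if hits > best_hits:
            best_start, best_hits = start, hits
    return doc_words[best_start:best_start + width]
```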