Friday, February 28, 2014
Notes Week 8
Week 8
MIR Ch10 User Interface and Visualization
Human computer interaction
1) Principles
· Offering informative feedback is especially important for information access interfaces.
· Reduce working memory load. Information access is an iterative process, the goals of which shift and change as information is encountered. One key way information access interfaces can help with memory load is to provide mechanisms for keeping track of choices made during the search process.
· Provide alternative interfaces for novice and expert users. An important tradeoff in all user interface design is that of simplicity vs. power.
2) Role of visualization
· Humans are highly attuned to images and visual information. Pictures can be captivating and appealing, especially if well designed.
· The growing prevalence of fast graphics processors and high-resolution color monitors is increasing interest in information visualization.
· Visualization of inherently abstract information is more difficult, and visualization of textually represented information is especially challenging.
3) Evaluation
· An important aspect of HCI is the methodology for evaluating user interface techniques. Precision and recall measures have been widely used for comparing the ranking results of non-interactive systems, but are less appropriate for assessing interactive systems.
· Empirical data involving human users is time-consuming to gather and difficult to draw conclusions from.
The information access process
1) Model of interaction
· Start with an information need.
· Select a system and collections to search on.
· Formulate a query.
· Send the query to the system.
· Receive the results in the form of information items.
· Scan, evaluate, and interpret the results.
· Either stop, or
· Reformulate the query and repeat from step 4.
2) Earlier interface studies
The bulk of the literature on studies of HCI information-seeking behavior concerns information intermediaries using online systems consisting of bibliographic records.
Friday, February 21, 2014
Notes Week 7 and NO MUDDIST FOR THIS WEEK
Week 7
NO MUDDIST FOR WEEK 6
IIR CH9 Relevance feedback and query expansion
This chapter discusses ways in which a system can help with query refinement, either fully automatically or with the user in the loop.
v Relevance feedback and
pseudo relevance feedback
Ø The idea of relevance
feedback (RF) is to involve the user in the retrieval process so as to improve
the final result set. In particular, the user gives feedback on the relevance
of documents in an initial set of results. The basic procedure is:
§ The user issues a (short,
simple) query.
§ The system returns an
initial set of retrieval results.
§ The user marks some
returned documents as relevant or nonrelevant.
§ The system computes a
better representation of the information need based on the user feedback.
§ The system displays a
revised set of retrieval results.
Ø Image search provides a good example of relevance feedback. Not only is it easy to see the results at work, but this is a domain where a user can easily have difficulty formulating what they want in words, yet can easily indicate relevant or nonrelevant images.
Ø The Rocchio algorithm
§ The Rocchio algorithm is the classic algorithm for implementing relevance feedback. It models a way of incorporating relevance feedback information into the vector space model.
§ Relevance feedback can improve both recall and precision. But, in practice, it has been shown to be most useful for increasing recall in situations where recall is important. This is partly because the technique expands the query.
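The Rocchio update, q_m = α·q0 + β·centroid(relevant docs) − γ·centroid(nonrelevant docs), can be sketched with NumPy (a minimal sketch; the weights α=1.0, β=0.75, γ=0.15 are commonly cited defaults, not values from these notes):

```python
import numpy as np

def rocchio(q0, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Modified query = alpha*q0 + beta*centroid(relevant) - gamma*centroid(nonrelevant).
    Negative term weights are clipped to zero, as is standard practice."""
    q = alpha * q0
    if relevant:
        q = q + beta * np.mean(relevant, axis=0)
    if nonrelevant:
        q = q - gamma * np.mean(nonrelevant, axis=0)
    return np.maximum(q, 0.0)

# Toy 4-term vocabulary: the query mentions term 0; relevant docs also use term 1.
q0 = np.array([1.0, 0.0, 0.0, 0.0])
rel = [np.array([1.0, 1.0, 0.0, 0.0]), np.array([0.5, 1.0, 0.0, 0.0])]
nonrel = [np.array([0.0, 0.0, 1.0, 0.0])]
qm = rocchio(q0, rel, nonrel)
# Term 1 gains weight (query expansion); term 2's negative weight is clipped to 0.
```

Note how the expanded query now carries weight on term 1, which the user never typed: this is exactly why Rocchio tends to help recall.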
Ø Probabilistic relevance feedback
§ If a user has told us some relevant and nonrelevant documents, then we can proceed to build a classifier. One way of doing this is with a Naive Bayes probabilistic model. If R is a Boolean indicator variable expressing the relevance of a document, then we can estimate P(xt = 1|R), the probability of a term t appearing in a document, depending on whether it is relevant or not:
P̂(xt = 1|R = 1) = |VRt| / |VR|
P̂(xt = 1|R = 0) = (dft − |VRt|) / (N − |VR|)
where VR is the set of known relevant documents, VRt is the subset of VR containing term t, dft is the document frequency of t in the whole collection, and N is the total number of documents.
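These estimates translate directly into code (a small sketch; the variable names are mine, mirroring the symbols in the formulas):

```python
def term_relevance_estimates(vr_t, vr, df_t, n_docs):
    """Naive Bayes estimates for P(x_t = 1 | R):
    vr_t   : number of known-relevant docs containing term t (|VRt|)
    vr     : number of known-relevant docs (|VR|)
    df_t   : document frequency of t over the whole collection
    n_docs : collection size N
    """
    p_rel = vr_t / vr                          # P̂(xt = 1 | R = 1)
    p_nonrel = (df_t - vr_t) / (n_docs - vr)   # P̂(xt = 1 | R = 0)
    return p_rel, p_nonrel

# Toy numbers: 3 of 5 judged-relevant docs contain t; t occurs in 100 of 10,000 docs.
p_rel, p_nonrel = term_relevance_estimates(vr_t=3, vr=5, df_t=100, n_docs=10000)
# p_rel = 0.6, while p_nonrel is tiny: t is strong evidence of relevance.
```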
Ø When does relevance
feedback work?
§ Firstly, the user has to
have sufficient knowledge to be able to make an initial query which is at least
somewhere close to the documents they desire.
§ Secondly, the relevance
feedback approach requires relevant documents to be similar to each other.
Ø RF is rarely used in the
web search.
Ø Evaluation in RF
strategies
§ Interactive relevance
feedback can give very substantial gains in retrieval performance. Empirically,
one round of relevance feedback is often very useful. Two rounds is sometimes
marginally more useful.
§ There is some subtlety to evaluating the effectiveness of relevance feedback in a sound and enlightening way. The obvious first strategy is to start with an initial query q0 and to compute a precision-recall graph. A second idea is to use documents in the residual collection (the set of documents minus those assessed relevant) for the second round of evaluation.
Ø Pseudo RF
Pseudo RF provides a method for automatic local analysis. It automates the manual part of relevance feedback, so that the user gets improved retrieval performance without an extended interaction. The method is to do normal retrieval to find an initial set of most relevant documents, then assume that the top k ranked documents are relevant, and finally do relevance feedback as before under this assumption.
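The pseudo RF loop can be sketched end to end (a toy sketch: the term-overlap scorer, the top-k cutoff, and the expand-by-two rule are all my simplifications, not anything the chapter prescribes):

```python
from collections import Counter

def pseudo_relevance_feedback(query_terms, docs, k=2):
    """Pseudo (blind) RF: retrieve once, assume the top-k results are
    relevant, and expand the query with their most frequent new terms.
    `docs` is a list of term lists; scoring is toy term-overlap."""
    def score(doc):
        return sum(doc.count(t) for t in query_terms)

    ranked = sorted(docs, key=score, reverse=True)
    assumed_relevant = ranked[:k]              # the blind-feedback assumption
    counts = Counter(t for d in assumed_relevant for t in d)
    # Expand with the two most common terms not already in the query.
    expansion = [t for t, _ in counts.most_common() if t not in query_terms][:2]
    return list(query_terms) + expansion

docs = [
    ["jaguar", "car", "speed", "car"],
    ["jaguar", "car", "engine"],
    ["jaguar", "cat", "wildlife"],
]
expanded = pseudo_relevance_feedback(["jaguar", "car"], docs, k=2)
# The expanded query picks up "speed" and "engine" from the top-ranked docs.
```

This also illustrates the failure mode: if the top-k documents are about the wrong sense of the query (here, the animal instead of the car), the expansion drifts the query further in the wrong direction.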
v Relevance feedback has
been shown to be very effective at improving relevance of results. Its
successful use requires queries for which the set of relevant documents is
medium to large. Full relevance feedback is often onerous for the user, and its
implementation is not very efficient in most IR systems. In many cases, other
types of interactive retrieval may improve relevance by about as much with less
work.
Beyond the core ad hoc retrieval scenario, other
uses of relevance feedback include:
Ø Following a changing
information need (e.g., names of car models of interest change over time)
Ø Maintaining an information
filter (e.g., for a news feed). Such filters are discussed further.
Ø Active learning (deciding
which examples it is most useful to know the class of to reduce annotation
costs).
v Global methods for query reformulation
Ø This section briefly discusses three global methods for expanding a query: by simply aiding the user in doing so; by using a manual thesaurus; and by building a thesaurus automatically.
§ Vocabulary tools
Various user supports in the search process can
help the user see how their searches are or are not working. This includes
information about words that were omitted from the query because they were on
stop lists, what words were stemmed to, the number of hits on each term or
phrase, and whether words were dynamically turned into phrases.
§ Query expansion
In relevance feedback, users give additional input
on documents (by marking documents in the results set as relevant or not), and
this input is used to reweight the terms in the query for documents. In query
expansion on the other hand, users give additional input on query words or
phrases, possibly suggesting additional query terms.
§ Automatic thesaurus generation
As an alternative to the cost of a manual
thesaurus, we could attempt to generate a thesaurus automatically by analyzing
a collection of documents. There are two main approaches. One is simply to
exploit word co-occurrence. The other approach is to use a shallow grammatical
analysis of the text and to exploit grammatical relations or grammatical
dependencies.
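The co-occurrence approach can be sketched with a term-document matrix: if A has one row per term, then C = A·Aᵀ gives term-term similarity weights based on shared documents (a minimal sketch using raw counts; a real system would weight the matrix, e.g. with tf-idf, and normalize):

```python
import numpy as np

# Toy term-document matrix A (rows = terms, columns = documents).
terms = ["car", "auto", "engine", "cat"]
A = np.array([
    [2, 1, 0],   # "car"    appears in docs 0 and 1
    [1, 1, 0],   # "auto"   appears in docs 0 and 1
    [1, 0, 0],   # "engine" appears in doc 0
    [0, 0, 3],   # "cat"    appears in doc 2 only
], dtype=float)

C = A @ A.T                  # C[u, v]: co-occurrence weight between terms u and v
np.fill_diagonal(C, 0)       # ignore self-similarity

def most_similar(term):
    """Return the term with the highest co-occurrence weight."""
    i = terms.index(term)
    return terms[int(np.argmax(C[i]))]
```

With these counts, "car" maps to "auto" (they share two documents), while "cat" shares no documents with the automotive terms.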
Friday, February 14, 2014
Muddist Week 5
It's a question from this week's reading.
Evaluation of unranked sets
I don't understand why we use a harmonic mean rather than the simpler average (arithmetic mean). The book said: "When the values of two numbers differ greatly, the harmonic mean is closer to their minimum than to their arithmetic mean." But I can't see why we want the mean to be closer to the minimum.
Notes Week 6
Week 6
CH8. EVALUATION IN IR
· IR system evaluation
1. Relevant or nonrelevant
a. Relevance is assessed relative to an information need, not a query.
b. Relevance can reasonably be thought of as a scale, with some documents highly relevant and others marginally so.
· Standard test collections (particularly for ad hoc IR systems)
1. CRANFIELD
The pioneering test collection in allowing precise quantitative measures of information retrieval effectiveness, but too small by modern standards.
2. Text Retrieval Conference (TREC), from 1992
3. GOV2
4. ...
· Evaluation of unranked retrieval sets
The two most frequent and basic measures for information retrieval effectiveness are precision and recall.
1. Precision is the fraction of retrieved documents that are relevant.
Precision = P(relevant|retrieved)
2. Recall is the fraction of relevant documents that are retrieved.
              | Relevant             | Nonrelevant
Retrieved     | true positives (tp)  | false positives (fp)
Not retrieved | false negatives (fn) | true negatives (tn)
Recall = P(retrieved|relevant)
P = tp/(tp+fp) and R = tp/(tp+fn)
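Both measures can be computed directly from sets of document IDs (a minimal sketch; the toy IDs below are made up):

```python
def precision_recall(retrieved, relevant):
    """Unranked-set precision and recall from document-ID sets."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)        # true positives
    precision = tp / len(retrieved)       # tp / (tp + fp)
    recall = tp / len(relevant)           # tp / (tp + fn)
    return precision, recall

# 3 of the 4 retrieved docs are relevant; 3 of the 6 relevant docs were found.
p, r = precision_recall({1, 2, 3, 9}, {1, 2, 3, 4, 5, 6})
# p = 0.75, r = 0.5
```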
3. The measures of precision and recall concentrate the evaluation on the return of true positives, asking what percentage of the relevant documents have been found and how many false positives have also been returned.
4. In general we want to get some amount of recall while tolerating only a certain percentage of false positives. A single measure that trades off precision versus recall is the F measure, which is the weighted harmonic mean of precision and recall.
5. Q: Why do we use a harmonic mean rather than the simpler average (arithmetic mean)? P157
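A quick numerical check suggests the intuition behind the question above: a degenerate system that returns every document trivially achieves recall 1.0, and the arithmetic mean would still score it above 0.5, while the harmonic mean stays close to the tiny precision (a minimal sketch):

```python
def f1(precision, recall):
    """Balanced F measure: the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Return-everything strategy on a collection where 2% of docs are relevant:
p, r = 0.02, 1.0
arithmetic = (p + r) / 2   # 0.51 -- looks deceptively respectable
harmonic = f1(p, r)        # stays near the smaller of the two values
```

Because the harmonic mean hugs the minimum, a system can only score well when both precision and recall are reasonably high, which is why it is preferred over the arithmetic mean.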
· Evaluation of ranked retrieval results
The interpolated precision pinterp at a certain recall level r is defined as the highest precision found for any recall level r′ ≥ r.
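The definition translates directly into code (a small sketch over made-up (recall, precision) points from a toy ranked run):

```python
def interpolated_precision(pr_points, r):
    """pinterp(r) = max precision at any recall level r' >= r.
    pr_points is a list of (recall, precision) pairs from a ranked run."""
    candidates = [p for (r_prime, p) in pr_points if r_prime >= r]
    return max(candidates) if candidates else 0.0

# Sawtooth precision-recall points typical of a ranked result list.
points = [(0.2, 1.0), (0.4, 0.67), (0.4, 0.5), (0.6, 0.6), (0.8, 0.57)]
```

Taking the maximum over all higher recall levels is what smooths the sawtooth shape of a raw precision-recall curve into a monotonically non-increasing one.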
· Assessing relevance
One
clear problem with the relevance-based assessment that we have presented is the
distinction between relevance and marginal relevance: whether a document still
has distinctive usefulness after the user has looked at certain other documents.
· System quality and user utility
1. System issues
2. User utility
3. Refining a deployed system
· Results snippets
1. In many cases the user will not want to examine all the returned documents, so we want to make the results list informative enough that the user can do a final ranking of the documents for themselves based on relevance to their information need.
2. The two basic kinds of summaries are static, which are always the same regardless of the query, and dynamic (or query-dependent), which are customized according to the user’s information need as deduced from a query.
3. A static summary is generally comprised of either or both a subset of the document and metadata associated with the document.
4. Dynamic summaries display one or more “windows” on the document, aiming to present the pieces that have the most utility to the user in evaluating the document with respect to their information need.
5. Given a variety of keyword occurrences in a document, the goal is to choose fragments which are: (i) maximally informative about the discussion of those terms in the document, (ii) self-contained enough to be easy to read, and (iii) short enough to fit within the normally strict constraints on the space available for summaries.
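Criterion (i) is often approximated by choosing the window of the document containing the most query-term hits. A toy sketch of this dynamic-summary idea (the scoring is deliberately simplistic, just counting hits in a fixed-width window):

```python
def best_snippet(doc_words, query_terms, window=10):
    """Pick the fixed-width window containing the most query-term hits:
    a toy version of dynamic (query-dependent) summary selection."""
    qs = set(query_terms)
    best_start, best_hits = 0, -1
    for start in range(max(1, len(doc_words) - window + 1)):
        hits = sum(1 for w in doc_words[start:start + window] if w in qs)
        if hits > best_hits:
            best_start, best_hits = start, hits
    return " ".join(doc_words[best_start:best_start + window])

doc = ("the quick brown fox jumps over the lazy dog while the "
       "hungry fox hunts the small rabbit near the old barn").split()
snippet = best_snippet(doc, ["fox", "rabbit"], window=8)
# Picks the window where both query terms co-occur.
```

Real snippet generators add the other two criteria on top: preferring windows that align with sentence boundaries (self-containment) and truncating to the space budget.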