Precision
Precision is one of several measures of the performance of an information system. It was first used in the so-called "Cranfield II experiments" (Cleverdon et al., 1966). Precision is defined as the proportion of relevant documents among all the documents retrieved in a given search:
Precision = a / (a + b) × 100%, where a = number of retrieved, relevant documents and b = number of retrieved, non-relevant documents (often termed "noise").
Precision is thus a measure of the amount of noise in document retrieval: the higher the precision, the less noise in the retrieved set.
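The formula can be illustrated with a minimal Python sketch. The function name, the document identifiers and the relevance judgements below are hypothetical and chosen only for illustration; they do not come from the Cranfield experiments.

```python
def precision(retrieved, relevant):
    """Return precision as a percentage: a / (a + b) * 100."""
    retrieved = set(retrieved)
    if not retrieved:
        # With nothing retrieved, a + b = 0 and the ratio is undefined.
        raise ValueError("precision is undefined for an empty retrieved set")
    a = len(retrieved & set(relevant))  # retrieved, relevant documents
    b = len(retrieved) - a              # retrieved, non-relevant documents ("noise")
    return a / (a + b) * 100


# Hypothetical search result and relevance judgements.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d1", "d3", "d7"]

print(precision(retrieved_docs, relevant_docs))  # 50.0
```

Here two of the four retrieved documents are relevant, so a = 2, b = 2 and precision is 50%. The relevant document "d7" was not retrieved and therefore does not affect precision; it would only affect recall.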
Together with recall, precision is among the most common
measures (and concepts) used to assess the performance of retrieval
systems. In older literature precision is sometimes termed "relevance".
Precision should not, however, be confused with the concept of
relevance. Relevance is a concept
that is a precondition for the measurement of precision
(as well as recall).
In information retrieval there is a wide range of strategies for increasing
precision. Among these are limiting the search to specific fields (titles,
descriptors and identifiers are often used when searching in abstracts or
full texts yields too low precision), using more well-defined terms, and making
more restricted use of "or" in Boolean logic (perhaps with more use of logical "and"). Such strategies are, however, seldom without costs: they tend to
lower recall.
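The trade-off between logical "or" and logical "and" can be sketched with a toy Boolean search in Python. Everything in this example is an assumption made for illustration: the five-document collection, its index terms, the relevance judgements and the function names are hypothetical. Recall is computed here as the share of all relevant documents that are retrieved.

```python
# Hypothetical toy collection: document id -> set of index terms.
documents = {
    "d1": {"information", "retrieval", "evaluation"},
    "d2": {"information", "theory"},
    "d3": {"retrieval", "dogs"},
    "d4": {"information", "retrieval", "precision"},
    "d5": {"database", "systems"},
}

# Assumed relevance judgements for the query below.
relevant = {"d1", "d2", "d4"}


def search_or(terms):
    """Retrieve documents containing at least one of the terms."""
    return {doc for doc, index in documents.items() if index & set(terms)}


def search_and(terms):
    """Retrieve documents containing all of the terms."""
    return {doc for doc, index in documents.items() if set(terms) <= index}


def precision_recall(retrieved, relevant):
    """Return (precision, recall) as percentages rounded to one decimal."""
    hits = len(retrieved & relevant)
    # Convention chosen for this sketch: report 0.0 when nothing is retrieved.
    precision = hits / len(retrieved) * 100 if retrieved else 0.0
    recall = hits / len(relevant) * 100
    return round(precision, 1), round(recall, 1)


query = ["information", "retrieval"]
print(precision_recall(search_or(query), relevant))   # (75.0, 100.0)
print(precision_recall(search_and(query), relevant))  # (100.0, 66.7)
```

In this toy collection, replacing "or" with "and" removes the non-relevant document "d3" from the result and raises precision from 75% to 100%, but it also drops the relevant document "d2", so recall falls from 100% to about 67%.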
Literature:
Cleverdon, C. W.; Mills, J. & Keen, E. M. (1966). Factors determining the performance of indexing systems. Vol. 1-2. Cranfield, U.K.: College of Aeronautics.
Birger Hjørland
Last edited: 12-02-2007