北京交通大学论坛-知行信息交流平台

Precision (查准率) and Recall (查全率)

Posted 2016-5-29 22:03
What are precision and recall?
Precision (查准率) and recall (查全率) are used in several subfields of information processing.
Information Retrieval
Definition
Precision and recall are used to measure the performance of a search engine:
       Recall = (number of relevant items retrieved / total number of relevant items in the system) * 100%
       Precision = (number of relevant items retrieved / total number of items retrieved) * 100%
Recall measures a retrieval system's (and searcher's) ability to find relevant information; precision measures its ability to reject irrelevant information.
Experiments show an inverse trade-off between the two: raising the recall of the output tends to lower its precision, and vice versa.
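The two retrieval formulas above can be sketched with set operations. This is a minimal illustration; the document IDs are invented for the example.

```python
# Invented example: compute retrieval precision and recall from item-ID sets.
relevant = {"d1", "d2", "d3", "d4"}    # all relevant documents in the system
retrieved = {"d1", "d2", "d5"}         # documents the engine actually returned

hits = relevant & retrieved            # relevant documents that were retrieved

recall = len(hits) / len(relevant)     # 2 / 4 = 0.5
precision = len(hits) / len(retrieved) # 2 / 3

print(f"recall={recall:.2f}, precision={precision:.2f}")
```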

Limitations
The main limitation of recall: recall is the ratio of relevant items retrieved to all relevant items stored in the system, but the total number of relevant items in the system is generally unknown and can only be estimated. Recall also rests on an assumption that every retrieved relevant item is equally valuable to the user, which is rarely true in practice; for the user, the degree of relevance often matters far more than the quantity.
The main limitation of precision: if search results are bibliographic records rather than full text, the brevity of a record makes it hard for the user to judge whether a result is actually relevant to the topic; the full text must be consulted before that judgment can be made. The "relevant information" in the precision formula carries the same equal-value assumption.


Information Extraction
Recall and precision can also be applied in the information extraction subfield, to measure the performance of an extractor.
Recall measures the proportion of the correct information that was extracted, while precision measures how much of the extracted information is correct.
The formulas are as follows (P is precision, R is recall):
       Precision = number of correct items extracted / number of items extracted
       Recall = number of correct items extracted / number of correct items in the sample
Both take values between 0 and 1; the closer to 1, the higher the recall or precision.
Besides these two metrics there is the F-measure, the weighted harmonic mean of precision and recall:
       F = (b^2 + 1) * P * R / (b^2 * P + R)
Here b is a preset weight between P and R: b > 1 means R is weighted more heavily, b < 1 means P is weighted more heavily. It is usually set to 1, treating the two as equally important.
The F-measure thus condenses system quality into a single number, which likewise is better the closer it is to 1.
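The F-measure formula above can be written directly as a small function; this is a sketch, with the function name and the zero-division guard being my own additions.

```python
def f_measure(p, r, b=1.0):
    """Weighted harmonic mean of precision p and recall r.

    b > 1 weights recall more heavily, b < 1 weights precision more heavily;
    b = 1 gives the familiar F1 score.
    """
    if p == 0 and r == 0:
        return 0.0  # guard: both zero would divide by zero
    return (b**2 + 1) * p * r / (b**2 * p + r)

print(f_measure(0.5, 0.5))   # 0.5
print(f_measure(1.0, 0.5))   # 2/3 -- F1 rewards balance over a single high score
```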


文本分类
在文本分类领域,查准率和查全率还可以用来衡量文本分类器的性能。例如,在观点挖掘(opinion mining)领域,衡量分类器识别出正面观点(positive opinion)的性能:
       查准率 = 识别出的真正的正面观点数 / 所有的识别为正面观点的条数       查全率 = 识别出的真正的正面观点数 / 样本中所有的真正正面观点的条数
For a detailed explanation, see the Wikipedia entry:
In a statistical classification task, the Precision for a class is the number of true positives (i.e. the number of items correctly labeled as belonging to the positive class) divided by the total number of elements labeled as belonging to the positive class (i.e. the sum of true positives and false positives, which are items incorrectly labeled as belonging to the class). Recall in this context is defined as the number of true positives divided by the total number of elements that actually belong to the positive class (i.e. the sum of true positives and false negatives, which are items which were not labeled as belonging to the positive class but should have been).
In a classification task, a Precision score of 1.0 for a class C means that every item labeled as belonging to class C does indeed belong to class C (but says nothing about the number of items from class C that were not labeled correctly) whereas a Recall of 1.0 means that every item from class C was labeled as belonging to class C (but says nothing about how many other items were incorrectly also labeled as belonging to class C).
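The true-positive / false-positive / false-negative counting described in the Wikipedia excerpt can be sketched as follows; the label lists are invented for illustration.

```python
# Invented example: per-class precision and recall for class "pos".
true_labels = ["pos", "pos", "neg", "pos", "neg"]
predicted   = ["pos", "neg", "pos", "pos", "neg"]

pairs = list(zip(true_labels, predicted))
tp = sum(t == "pos" and p == "pos" for t, p in pairs)  # correctly labeled pos
fp = sum(t != "pos" and p == "pos" for t, p in pairs)  # labeled pos, but isn't
fn = sum(t == "pos" and p != "pos" for t, p in pairs)  # is pos, but not labeled so

precision = tp / (tp + fp)   # 2 / 3
recall    = tp / (tp + fn)   # 2 / 3

print(f"tp={tp} fp={fp} fn={fn} precision={precision:.2f} recall={recall:.2f}")
```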
There is also an interesting application in opinion mining (see Bing Liu, "Sentiment Analysis and Subjectivity"):
One of the bottlenecks in applying supervised learning is the manual effort involved in annotating a large number of training examples. To save the manual labeling effort, a bootstrapping approach to label training data automatically is reported in [80, 81]. The algorithm works by first using two high-precision classifiers (HP-Subj and HP-Obj) to automatically identify some subjective and objective sentences. The high-precision classifiers use lists of lexical items (single words or n-grams) that are good subjectivity clues. HP-Subj classifies a sentence as subjective if it contains two or more strong subjective clues. HP-Obj classifies a sentence as objective if there are no strongly subjective clues. These classifiers will give very high precision but low recall. The extracted sentences are then added to the training data to learn patterns. The patterns (which form the subjectivity classifiers in the next iteration) are then used to automatically identify more subjective and objective sentences, which are then added to the training set, and the next iteration of the algorithm begins.
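The high-precision seeding step described above might be sketched as follows. The clue list and sentences are invented; only the thresholds follow the text (HP-Subj fires on two or more strong subjective clues, HP-Obj on zero).

```python
# Hypothetical sketch of the HP-Subj / HP-Obj seeding step.
# The clue lexicon is invented for illustration.
STRONG_SUBJ_CLUES = {"terrible", "wonderful", "hate", "love"}

def clue_count(sentence):
    """Count strong subjective clue words in a sentence (naive tokenization)."""
    return sum(w in STRONG_SUBJ_CLUES for w in sentence.lower().split())

def seed_label(sentence):
    """High-precision, low-recall seed labeling; None = left for later iterations."""
    n = clue_count(sentence)
    if n >= 2:
        return "subjective"   # HP-Subj fires
    if n == 0:
        return "objective"    # HP-Obj fires
    return None               # exactly one clue: too ambiguous to seed

print(seed_label("I love this wonderful phone"))  # subjective
print(seed_label("The phone has 4 GB of RAM"))    # objective
```

Leaving one-clue sentences unlabeled is what keeps precision high at the cost of recall; those sentences are picked up only in later bootstrapping iterations.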





