Entity Linking

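Knowledge Graph: a semantic network that describes real-world concepts and entities and the relations between them; sometimes also called a Knowledge Base (KB). A graph is made up of triples of the form <entity1, relation, entity2> or <entity, attribute, attribute value>, for example <Yao Ming, plays-in, NBA> and <Yao Ming, height, 2.29m>. Common KBs include Wikidata, DBpedia, and YAGO.

Entity: the basic unit of a knowledge graph, and an important information-bearing linguistic unit in text.

Mention: a span of natural-language text that refers to an entity.

Applications

Question Answering: EL is a hard requirement for KBQA, because the graph database can only be queried after mentions have been linked to entities.
Content Analysis: public-opinion analysis, content recommendation, reading enhancement.
Information Retrieval: search engines built on semantic entities; when you search Google for certain entities, the corresponding Wikipedia page appears on the right.
Knowledge Base Population: extending a knowledge base and updating its entities and relations.

Candidate Entities and Disambiguation

An entity linking system consists of two components:

Candidate Entity Generation: starting from a mention, find all entities in the KB that it could refer to; together they form the candidate entity set (candidate entities).
Entity Disambiguation: from the candidate entities, select the most likely one as the predicted entity.

Entity Disambiguation (ED) is the most important part.

Features

Context-Independent Features:
LinkCount: #(m->e), the number of times mention m points to entity e in the knowledge base.
Entity Attributes: Popularity, Type.

Context-Dependent Features:
Textual Context: BOW (bag of words), Concept Vector.
Coherence Between Entities: WLM (Wikipedia Link-based Measure), PMI (pointwise mutual information), Jaccard Distance.

Context-Independent Features

These cover the LinkCount from a mention to an entity and the attributes of the entity itself (such as popularity and type). LinkCount acts as prior knowledge and is often very useful during disambiguation.

Context-Dependent Features

Disambiguating all entities jointly (globally) is in fact an NP-hard problem, so the core question is how to exploit coherence features more quickly and effectively ...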
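The two-stage pipeline described above (candidate generation, then disambiguation using a LinkCount prior plus a coherence feature) can be sketched roughly as follows. This is a minimal illustration, not a reference implementation. The link counts, the inlink sets, the function names, and the `weight` parameter are all hypothetical and invented for the example. Coherence is computed here as Jaccard similarity (1 minus the Jaccard distance) over inlink sets, and the local scoring rule simply mixes prior and coherence rather than attempting the global, NP-hard joint optimization.

```python
from collections import defaultdict

# Hypothetical anchor-text statistics: (mention, entity) -> LinkCount #(m->e).
# All counts are invented for illustration.
LINK_COUNTS = {
    ("jordan", "Michael_Jordan"): 60,
    ("jordan", "Jordan_(country)"): 30,
    ("jordan", "Michael_I._Jordan"): 10,
}

# Hypothetical inlink sets (pages that link to each entity), also invented.
INLINKS = {
    "Michael_Jordan": {"Chicago_Bulls", "Basketball"},
    "Jordan_(country)": {"Amman", "Middle_East"},
    "Michael_I._Jordan": {"Machine_learning", "UC_Berkeley"},
    "Statistical_learning": {"Machine_learning", "UC_Berkeley", "Statistics"},
}


def build_alias_index(link_counts):
    """Group LinkCounts by mention: mention -> {entity: count}."""
    index = defaultdict(dict)
    for (mention, entity), count in link_counts.items():
        index[mention][entity] = count
    return index


def generate_candidates(index, mention):
    """Candidate entity generation: every KB entity the mention has linked to."""
    return dict(index.get(mention.lower(), {}))


def link_prior(candidates):
    """Context-independent prior P(e | m) = #(m->e) / sum over e' of #(m->e')."""
    total = sum(candidates.values())
    if total == 0:
        return {}
    return {entity: count / total for entity, count in candidates.items()}


def jaccard_similarity(a, b):
    """Jaccard similarity of two sets (1 - Jaccard distance)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def disambiguate(index, mention, context_entities=(), weight=0.5):
    """Pick the candidate maximizing (1 - weight) * prior + weight * coherence.

    Coherence is the mean Jaccard similarity between the candidate's inlinks
    and those of already-linked context entities (0.0 when there are none).
    """
    prior = link_prior(generate_candidates(index, mention))
    if not prior:
        return None  # mention unknown to the KB

    def score(entity):
        sims = [
            jaccard_similarity(INLINKS.get(entity, set()), INLINKS.get(ctx, set()))
            for ctx in context_entities
        ]
        coherence = sum(sims) / len(sims) if sims else 0.0
        return (1 - weight) * prior[entity] + weight * coherence

    return max(prior, key=score)


INDEX = build_alias_index(LINK_COUNTS)

if __name__ == "__main__":
    print(disambiguate(INDEX, "Jordan"))
    print(disambiguate(INDEX, "Jordan", ["Statistical_learning"]))
```

With no context, the choice falls back to the LinkCount prior alone, which is why the prior is such a strong baseline. Once a related context entity is supplied, the coherence term can override the prior in favor of a less popular candidate.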