ACL | Cong's Log

CorefQA - Coreference resolution as query-based span prediction

2020, ACL data: CoNLL-2012, GAP task: Coreference Resolution 通过QA方式处理coreference问题，A query is generated for each candidate mention using its surrounding con- text, and a span prediction module is em- ployed to extract the text spans of the corefer- ences within the document using the generated query. 近期的方法有consider all text spans in a document as potential mentions and learn to find an antecedent for each possible mention. There。这种仅依靠mention的做对比的方法的缺点： At the task formalization level：因为当前数据集有很多遗漏的mention， mentions left out at the mention proposal stage can never be recov- ered since the downstream module only operates on the proposed mentions. At the algorithm level：Semantic matching operations be- tween two mentions (and their contexts) are per- formed only at the output layer and are relatively superficial 方法 Speaker information： directly concatenates the speaker’s name with the corresponding utterance. ...

Early Rumour Detection

2019, ACL data: TWITTER, WEIBO links: https://www.aclweb.org/anthology/N19-1163, https://github.com/DeepBrainAI/ERD task: Rumour Detection 这篇文章采用GRU编码社交媒体posts stream，作为环境的状态表示；训练一个分类器以GRU的状态输出为输入，对文本做二分类判断是否是rumor。用DQN训练agent，根据状态做出是否启动rumor分类器进行判断，并根据分类结果对错给予奖惩。目标就是尽可能准尽可能早地预测出社交媒体posts是否是rumor。 Focuses on the task of rumour detection; particularly, we are in- terested in understanding how early we can detect them. Our model treats social media posts (e.g. tweets) as a data stream and integrates reinforcement learning to learn the number minimum num- ber of posts required before we classify an event as a rumour. Let $E$ denote an event, and it consists of a series of relevant posts $x_i$, where $x_0$ denotes the source message and $x_T$ the last relevant message. The objective of early rumor detection is to make a classification decision whether E is a rumour as early as possible while keeping an acceptable detection accuracy. ...

Matching the Blanks - Distributional Similarity for Relation Learning

2019, ACL data: KBP37, SemEval 2010 Task 8, TACRED task: Entity and Relation Extraction Build task agnostic relation representations solely from entity-linked text. 缺陷文章认为网页中, 相同的的实体对一般指代相同的实体关系, 把实体不同的构建为负样本. 这个在单份文件中可能大概率是对的. 但是实体不完全一直不代表这个两对实体的关系不同. 所以这个作为负样本是本质上映射的是实体识别而不是关系. 比较好的方式是把实体不同但是关系一样的也考虑进来. 方法 Define Relation Statement We define a relation statement to be a block of text containing two marked entities. From this, we create training data that contains relation statements in which the entities have been replaced with a special [BLANK] ...

Improving Event Detection via Open-domain Trigger Knowledge

2020, ACL data: ACE 05 task: Event Detection Propose a novel Enrichment Knowledge Distillation (EKD) model to efficiently distill external open-domain trigger knowledge to reduce the in-built biases to frequent trigger words in annotations. leverage the wealth of the open-domain trigger knowledge to improve ED propose a novel teacher-student model (EKD) that can learn from both labeled and unlabeled data 缺点只能对付普遍情况, 即一般性的触发词; 但触发词不是在任何语境下都是触发词. 方法 empower the model with external knowledge called Open-Domain Trigger Knowledge, defined as a prior that specifies which words can trigger events without subject to pre-defined event types and the domain of texts. ...

Cross-media Structured Common Space for Multimedia Event Extraction

2020, ACL Task: MultiMedia Event Extraction Introduce a new task, MultiMedia Event Extraction (M2E2), which aims to extract events and their arguments from multimedia documents. Construct the first benchmark and evaluation dataset for this task, which consists of 245 fully annotated news articles Propose a novel method, Weakly Aligned Structured Embedding (WASE), that encodes structured representations of semantic information from textual and visual data into a common embedding space. which takes advantage of annotated unimodal corpora to separately learn visual and textual event extraction, and uses an image-caption dataset to align the modalities ...

Knowledge-Graph-Embedding的Translate族（TransE，TransH，TransR，TransD）

data: WN18, WN11, FB15K, FB13, FB40K task: Knowledge Graph Embedding TransE Translating Embeddings for Modeling Multi-relational Data（2013） https://proceedings.neurips.cc/paper/2013/file/1cecc7a77928ca8133fa24680a88d2f9-Paper.pdf 这是转换模型系列的第一部作品。该模型的基本思想是使head向量和relation向量的和尽可能靠近tail向量。这里我们用L1或L2范数来衡量它们的靠近程度。损失函数 $\mathrm{L}(h, r, t)=\max \left(0, d_{\text {pos }}-d_{\text {neg }}+\text { margin }\right)$使损失函数值最小化，当这两个分数之间的差距大于margin的时候就可以了(我们会设置这个值，通常是1) 但是这个模型只能处理一对一的关系，不适合一对多/多对一关系，例如，有两个知识，(skytree, location, tokyo)和(gundam, location, tokyo)。经过训练，“sky tree”实体向量将非常接近“gundam”实体向量。但实际上它们没有这样的相似性。 with tf.name_scope("embedding"): self.ent_embeddings = tf.get_variable(name = "ent_embedding", shape = [entity_total, size], initializer = tf.contrib.layers.xavier_initializer(uniform = False)) self.rel_embeddings = tf.get_variable(name = "rel_embedding", shape = [relation_total, size], initializer = tf.contrib.layers.xavier_initializer(uniform = False)) pos_h_e = tf.nn.embedding_lookup(self.ent_embeddings, self.pos_h) pos_t_e = tf.nn.embedding_lookup(self.ent_embeddings, self.pos_t) pos_r_e = tf.nn.embedding_lookup(self.rel_embeddings, self.pos_r) neg_h_e = tf.nn.embedding_lookup(self.ent_embeddings, self.neg_h) neg_t_e = tf.nn.embedding_lookup(self.ent_embeddings, self.neg_t) neg_r_e = tf.nn.embedding_lookup(self.rel_embeddings, self.neg_r) if config.L1_flag: pos = tf.reduce_sum(abs(pos_h_e + pos_r_e - pos_t_e), 1, keep_dims = True) neg = tf.reduce_sum(abs(neg_h_e + neg_r_e - neg_t_e), 1, keep_dims = True) self.predict = pos else: pos = tf.reduce_sum((pos_h_e + pos_r_e - pos_t_e) ** 2, 1, keep_dims = True) neg = tf.reduce_sum((neg_h_e + neg_r_e - neg_t_e) ** 2, 1, keep_dims = True) self.predict = pos with tf.name_scope("output"): self.loss = tf.reduce_sum(tf.maximum(pos - neg + margin, 0)) TransH Knowledge Graph Embedding by Translating on Hyperplanes（2014） ...

Open-Domain Targeted Sentiment Analysis via Span-Based Extraction and Classification

2019, ACL data: SemEval 2014, SemEval 2014 ABSA, SemEval 2015, SemEval 2016 task: ABSA propose a span-based extract-then-classify framework, where multiple opinion targets are directly extracted from the sentence under the supervision of target span boundaries, and corresponding polarities are then classified using their span representations. 优点：用指针网络选取target，避免了序列标注的搜索空间过大问题用span边界+极性的标注方式，解决多极性的target问题方法 Input: sentence x =(x1,..., xn) with length n, Target list T = {t1,..., tm}： each target ti is annotated with its start, end position, and its sentiment polarity ...