This proposal addresses the task of Open Named Entity Recognition. Named Entity Recognition (NER) is the task of identifying entity mentions within a given text corpus and assigning each mention to one of a small number of high-level classes (such as PERSON, LOCATION, or ORGANIZATION). Open NER (ONER) is NER in which the entity classes are numerous and not known in advance. ONER entails extracting both the entities and the classes into which entities may fall, and providing a mapping from each entity mention to the set of classes it belongs to.

For example, from the sentence "Our system adopts the popular collapsed Gibbs sampling approach," a typical NER system would not detect any mentions (no people, locations, or organizations are named in the sentence). An Open NER system, by contrast, should extract the entity "collapsed Gibbs sampling." Further, the system should assign the phrase to several classes extracted from elsewhere in the corpus (e.g., "Gibbs samplers," "MCMC Techniques," and so on).

TBD: systems, datasets, metrics

TBD: taxonomy induction focus on leaves, multi-class typing rather than the tree structure
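
The difference between the two task outputs in the example sentence can be sketched as data structures. This is a minimal illustration, not a proposed system: the names `Mention`, `find_mention`, `closed_ner_output`, and `open_ner_output` are hypothetical, and the outputs are written by hand to mirror the example rather than produced by any model.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Mention:
    """A span of text identified as an entity mention (hypothetical type)."""
    text: str
    start: int  # character offset where the mention begins
    end: int    # character offset just past the mention


SENTENCE = "Our system adopts the popular collapsed Gibbs sampling approach"


def find_mention(sentence: str, phrase: str) -> Mention:
    """Locate a phrase in the sentence and wrap it as a Mention."""
    start = sentence.index(phrase)
    return Mention(phrase, start, start + len(phrase))


# Closed NER: the label set is small and fixed in advance.
CLOSED_CLASSES = {"PERSON", "LOCATION", "ORGANIZATION"}
# Hand-written expected output: no people, locations, or organizations appear.
closed_ner_output: Dict[Mention, str] = {}

# Open NER: each mention maps to a SET of classes, and the classes themselves
# are extracted from the corpus rather than fixed beforehand.
mention = find_mention(SENTENCE, "collapsed Gibbs sampling")
open_ner_output: Dict[Mention, Set[str]] = {
    mention: {"Gibbs samplers", "MCMC Techniques"},  # illustrative classes
}

# The class inventory is a by-product of extraction, not an input.
extracted_classes = set().union(*open_ner_output.values())
```

Mapping each mention to a set, rather than a single label, reflects the multi-class typing described above.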
