Find people, places, organizations, trademarks

Identify people, locations, organizations, trademarks/products and more, and make content richer and more dynamic

Overview

Nstein's entity extractor, known as Nfinder, locates and extracts places, people, organizations, trademarks/products and almost any other named item. Extracted entities offer a powerful way to sort, visualize and organize data: anything with a name can, in principle, be identified. Nfinder helps extract the five Ws (who, what, where, when and why) from content.

Applications

Inline Tagging: Inline tags are hyperlinks embedded directly in Web content that highlight important aspects of that content and let visitors explore it further. Identifying key entities simplifies the creation of appropriate inline tags.
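As an illustration of inline tagging, the sketch below wraps already-identified entity spans in hyperlinks. It is a minimal example under stated assumptions, not Nfinder's API: the function name `inline_tag`, the `(start, end, url)` span format and the example.org URL are all hypothetical.

```python
from html import escape


def inline_tag(text, spans):
    """Wrap entity spans in <a> tags.

    `spans` is an iterable of (start, end, url) character offsets.
    This input format is an assumption made for the example.
    """
    out = []
    cursor = 0
    for start, end, url in sorted(spans):
        if start < cursor:
            continue  # skip a span that overlaps one already tagged
        out.append(escape(text[cursor:start]))  # plain text before the entity
        out.append(f'<a href="{escape(url, quote=True)}">{escape(text[start:end])}</a>')
        cursor = end
    out.append(escape(text[cursor:]))  # remaining plain text
    return "".join(out)


print(inline_tag("Visit Montreal & Quebec City.",
                 [(6, 14, "https://example.org/montreal")]))
# Visit <a href="https://example.org/montreal">Montreal</a> &amp; Quebec City.
```

Escaping the surrounding text as well as the entity keeps the output valid HTML even when the source content contains characters such as "&".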

Linking Open Data: The term Linked Data was coined by Tim Berners-Lee and refers to a style of publishing and interlinking structured data on the Web. The basic assumption behind Linked Data is that the value and usefulness of data increase the more it is interlinked with other data. Nfinder facilitates linking data across multiple sources by exposing URI references to Freebase, Wikipedia, etc.

Geo-tagging: As more and more content is created and delivered via mobile channels, geo-tagging is becoming an increasingly lucrative business opportunity. Identifying geographic entities, or the locations of other entities, and feeding those locations into any number of map-based APIs can considerably increase the value of content.
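Many map-based APIs accept GeoJSON, so one way to picture this step is to convert extracted locations into a GeoJSON FeatureCollection. The sketch below assumes each location already carries coordinates. The input dictionaries, the function name `to_geojson` and the approximate coordinates are illustrative assumptions, not Nfinder output.

```python
import json


def to_geojson(locations):
    """Build a GeoJSON FeatureCollection from location entities.

    Each location is a dict with 'name', 'lat' and 'lon' keys
    (a hypothetical shape chosen for this example).
    """
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                # GeoJSON orders coordinates as [longitude, latitude].
                "geometry": {"type": "Point", "coordinates": [loc["lon"], loc["lat"]]},
                "properties": {"name": loc["name"]},
            }
            for loc in locations
        ],
    }


# Approximate coordinates, for illustration only.
collection = to_geojson([{"name": "Montreal", "lat": 45.5, "lon": -73.57}])
print(json.dumps(collection, indent=2))
```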

Real-time indexes: An index can be defined as "something that serves to guide, point out, or otherwise facilitate reference". Nfinder's entities in particular, and other TME metadata in general, can be used to generate real-time indexes that serve as value-added references after, or instead of, an initial full-text search, as presented below.

How it works

Nfinder uses authority files (controlled vocabularies) and linguistic rules to identify and extract all occurrences of an entity type in a document. Entity types can include product names, company names, personal names, geographic locations, dates, times, currencies, and more.

In order to perform entity extraction, Nfinder relies both on authority files and linguistic context:

  • Authority files are highly structured, pre-established lists of entities, organized hierarchically.
  • Linguistic rules handle context-based exceptions, language variations, syntactic and morphological exceptions, synonyms and alternative terms, abbreviations, Boolean combinations and more.
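The combination of authority files and linguistic rules can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Nfinder's implementation: the `AUTHORITY` dictionary, the `PERSON_RULE` pattern, the `Entity` class and the `extract` function are all hypothetical names invented for the example.

```python
import re
from dataclasses import dataclass

# Hypothetical authority file: canonical name -> type, hierarchy path, known variants.
AUTHORITY = {
    "International Business Machines": {
        "type": "organization",
        "path": ["Organizations", "Technology"],
        "variants": ["IBM", "I.B.M.", "International Business Machines"],
    },
    "Montreal": {
        "type": "location",
        "path": ["World", "North America", "Canada", "Quebec"],
        "variants": ["Montreal", "Montréal"],
    },
}

# One illustrative linguistic rule: capitalized words after a title form a person name.
PERSON_RULE = re.compile(r"\b(?:Mr|Ms|Mrs|Dr)\.\s+([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)")


@dataclass(frozen=True)
class Entity:
    name: str        # normalized (canonical) name
    type: str
    start: int       # character offsets in the source text
    end: int
    path: tuple = () # position in the authority-file hierarchy, if any


def extract(text):
    found = []
    # Authority-file lookup: any variant maps back to its canonical entry.
    for canonical, entry in AUTHORITY.items():
        variants = sorted(entry["variants"], key=len, reverse=True)
        pattern = "|".join(re.escape(v) for v in variants)
        for m in re.finditer(rf"(?<!\w)(?:{pattern})(?!\w)", text):
            found.append(Entity(canonical, entry["type"], m.start(), m.end(),
                                tuple(entry["path"])))
    # Linguistic rule: context (a preceding title) identifies unlisted names.
    for m in PERSON_RULE.finditer(text):
        found.append(Entity(m.group(1), "person", m.start(1), m.end(1)))
    return sorted(found, key=lambda e: e.start)


for e in extract("Dr. Jane Smith said IBM opened an office in Montréal."):
    print(e.type, e.name)
# person Jane Smith
# organization International Business Machines
# location Montreal
```

The authority file normalizes variants such as "IBM" and "Montréal" to a single canonical entry, while the rule picks up a name that appears in no list. A production system would need far richer rules than this single pattern.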

Nstein's TME Manager allows customers to edit generic and specific authority files.

Input/Output

Nfinder ingests documents in any format and outputs:

  • A list of normalized entities with attributes, hierarchical information and relevancy rankings