Public-Eye

Many named entity disambiguation services, such as DBpedia Spotlight, are now available on the web. They all expose a REST API and they all disambiguate against DBpedia resources. Their output formats differ, though, and this is where Public-Eye comes in handy. Public-Eye is a tiny open source library that harmonizes the different annotation results and gives you language detection automatically, thanks to the awesome languagedetect library.

// minimalistic example with spotlight

var publicEye     = require('public-eye')();
var text = 'First documented in the 13th century, Berlin was the capital of the Kingdom of Prussia (1701–1918), the German Empire (1871–1918), the Weimar Republic (1919–33) and the Third Reich (1933–45). Berlin in the 1920s was the third largest municipality in the world. After World War II, the city became divided into East Berlin -- the capital of East Germany -- and West Berlin, a West German exclave surrounded by the Berlin Wall from 1961–89. Following German reunification in 1990, the city regained its status as the capital of Germany, hosting 147 foreign embassies.';

publicEye.spotlight({
  text: text
}, (err, response) => {
  // ... response.Resources gives you a list of
  // {
  //    ...
  //    Resources: [
  //      {
  //        "@URI": "http://dbpedia.org/resource/German_reunification",
  //        "@support": "1989",
  //        "@types": "",
  //        "@surfaceForm": "German reunification",
  //        "@offset": "449",
  //        "@similarityScore": "0.9999997861474641",
  //        "@percentageOfSecondRank": "1.5374345655254399E-7"
  //      }
  //    ]
  //  }
});
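
The Spotlight fields shown above are all strings, including the offset and the score. As a rough sketch of what harmonizing such a response can involve, the helper below turns each resource into a plain object with numeric positions. The function name `normalizeSpotlight` and the output field names are made up for this illustration; they are not the format produced by the public-eye mapping service.

```javascript
// Hypothetical helper, not part of public-eye: converts the string-valued
// Spotlight fields into plain objects with numeric offsets and scores.
function normalizeSpotlight(response) {
  // Assumption: Resources may be missing when nothing was annotated.
  const resources = (response && response.Resources) || [];
  return resources.map((r) => {
    const start = parseInt(r['@offset'], 10);
    const surfaceForm = r['@surfaceForm'];
    return {
      uri: r['@URI'],
      surfaceForm: surfaceForm,
      start: start,
      end: start + surfaceForm.length, // exclusive end offset
      score: parseFloat(r['@similarityScore'])
    };
  });
}

// The resource from the example above, copied as a literal.
const sample = {
  Resources: [{
    '@URI': 'http://dbpedia.org/resource/German_reunification',
    '@support': '1989',
    '@types': '',
    '@surfaceForm': 'German reunification',
    '@offset': '449',
    '@similarityScore': '0.9999997861474641',
    '@percentageOfSecondRank': '1.5374345655254399E-7'
  }]
};

console.log(normalizeSpotlight(sample));
```

In a real script you would call `normalizeSpotlight(response)` inside the `publicEye.spotlight` callback, after checking `err`.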

This tiny library gives you easy access to a number of named entity disambiguation services, including DBpedia Spotlight, Babelfy and TextRazor. We have just added a basic service for a local Stanford NER server via the ner node library. The public-eye "mapping service" translates each proprietary format to a common format; this means you can annotate text using multiple services.

Installation

npm install public-eye --save

Simple examples and configuration hints

More examples are provided in the /test folder. The very first thing to do is to require the library and set the API keys provided by the different services:

var publicEye   = require('public-eye')({
    services: {
      textrazor: {
        apiKey: 'your-api-key'
      },
      babelfy: {
        key: 'your-babelfy-api-key'
      }
    }
  });

Once the library is available to your script, the easiest way to harmonize the results of different services on the same text is the series method:

publicEye.series({
      services:[
        'textrazor',
        'babelfy'
      ],
      text: 'First documented in the 13th century, Berlin was the capital of the Kingdom of Prussia (1701–1918), the German Empire (1871–1918), the Weimar Republic (1919–33) and the Third Reich (1933–45)'
    }, function(err, response){
      // ... response.entities
    })
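
All the methods in this README take a parameter object and a Node-style `(err, response)` callback, so they can be wrapped in a Promise with `util.promisify` from the Node standard library. The sketch below is one way to do that. The `annotate` function is hypothetical, and the stand-in client only imitates the callback signature so the snippet runs without API keys; its response shape is invented.

```javascript
const { promisify } = require('util');

// Hypothetical wrapper: `client` is the object returned by
// require('public-eye')(config); `service` is a method name such as
// 'textrazor', 'babelfy' or 'series'.
function annotate(client, service, params) {
  if (typeof client[service] !== 'function') {
    return Promise.reject(new Error('Unknown service: ' + service));
  }
  // bind keeps `this` pointing at the client in case the method relies on it.
  return promisify(client[service].bind(client))(params);
}

// Stand-in client so the sketch runs without network access or API keys.
// The response it returns is invented for illustration.
const fakeClient = {
  series(params, callback) {
    callback(null, { entities: [], text: params.text });
  }
};

annotate(fakeClient, 'series', { services: ['textrazor', 'babelfy'], text: 'Berlin' })
  .then((response) => console.log(response.entities))
  .catch((err) => console.error(err.message));
```

With the real library, `annotate(publicEye, 'series', { services: [...], text: text })` would return a Promise that resolves with the same response the callback receives.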

Usage for TextRazor entity disambiguation:

  
  var publicEye   = require('public-eye')({
    services: {
      textrazor: {
        apiKey: 'your-api-key'
      }
    }
  });
  
  // ..

  publicEye.textrazor({
    text: 'First documented in the 13th century, Berlin was the capital of the Kingdom of Prussia (1701–1918), the German Empire (1871–1918), the Weimar Republic (1919–33) and the Third Reich (1933–45). Berlin in the 1920s was the third largest municipality in the world. After World War II, the city became divided into East Berlin -- the capital of East Germany -- and West Berlin, a West German exclave surrounded by the Berlin Wall from 1961–89. Following German reunification in 1990, the city regained its status as the capital of Germany, hosting 147 foreign embassies.'
  }, function(err, response){
    // ...
    // your callback here, response.entities is the list of entities with startingPos and endingPos
  })
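
The positions mentioned in the callback comment can be used to recover the matched text. This is a minimal sketch under stated assumptions: `startingPos` and `endingPos` are taken to be zero-based character offsets with an exclusive end, which this README does not confirm. The helper `surfaceForms` and the sample entities are made up for illustration.

```javascript
// Hypothetical helper, not part of public-eye. Assumes startingPos and
// endingPos are zero-based character offsets and endingPos is exclusive.
function surfaceForms(text, entities) {
  return entities
    // Skip entities without usable integer positions.
    .filter((e) => Number.isInteger(e.startingPos) && Number.isInteger(e.endingPos))
    .map((e) => text.slice(e.startingPos, e.endingPos));
}

const sampleText = 'Berlin was the capital of Prussia';
// Hand-written entities for illustration, not real service output.
const sampleEntities = [
  { startingPos: 0, endingPos: 6 },
  { startingPos: 26, endingPos: 33 }
];

console.log(surfaceForms(sampleText, sampleEntities));
```

If the offsets turn out to be inclusive, the slice end would need to be `e.endingPos + 1` instead.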

Usage for Babelfy:

  
  var publicEye   = require('public-eye')({
    services: {
      babelfy: {
        key: 'your-api-key'
      }
    }
  });
  
  // ..

  publicEye.babelfy({
    text: 'First documented in the 13th century, Berlin was the capital of the Kingdom of Prussia (1701–1918), the German Empire (1871–1918), the Weimar Republic (1919–33) and the Third Reich (1933–45). Berlin in the 1920s was the third largest municipality in the world. After World War II, the city became divided into East Berlin -- the capital of East Germany -- and West Berlin, a West German exclave surrounded by the Berlin Wall from 1961–89. Following German reunification in 1990, the city regained its status as the capital of Germany, hosting 147 foreign embassies.'
  }, function(err, response){
    // ...
    // your callback here, response is the list of entities with startingPos and endingPos
  })

Usage for Stanford NER (cf. the ner node library documentation):

  var publicEye   = require('public-eye')({
    services: {
      stanfordNER: {
        port: 9191,
        host: 'localhost'
      }
    }
  });
  
  publicEye.stanfordNER({
    text: 'First documented in the 13th century, Berlin was the capital of the Kingdom of Prussia (1701–1918), the German Empire (1871–1918), the Weimar Republic (1919–33) and the Third Reich (1933–45). Berlin in the 1920s was the third largest municipality in the world. After World War II, the city became divided into East Berlin -- the capital of East Germany -- and West Berlin, a West German exclave surrounded by the Berlin Wall from 1961–89. Following German reunification in 1990, the city regained its status as the capital of Germany, hosting 147 foreign embassies.'
    }, function(err, body){
      console.log(body.entities);
      // => { LOCATION: 
      //      [ 'Berlin',
      //        'Prussia',
      //        'Weimar',
      //        'Berlin',
      //        'East Berlin',
      //        'East Germany',
      //        'West Berlin',
      //        'Berlin',
      //        'Germany' ],
      //      ORGANIZATION: [],
      //      DATE: 
      //      [ '13th century',
      //        '1918',
      //        '1918',
      //        '1919',
      //        '1933',
      //        '1920s',
      //        '1961',
      //        '1990' ],
      //      MONEY: [],
      //      PERSON: [ 'Reich' ],
      //      PERCENT: [],
      //      TIME: [] }

    });
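
The Stanford NER output above groups raw mentions by category and keeps repeats, so 'Berlin' appears three times. A small sketch for counting mentions per category follows. `countMentions` is a hypothetical helper, and the sample object is an excerpt of the output shown above.

```javascript
// Hypothetical helper, not part of public-eye: counts how often each
// mention string occurs within each category.
function countMentions(entities) {
  const counts = {};
  for (const [category, mentions] of Object.entries(entities)) {
    counts[category] = {};
    for (const mention of mentions) {
      counts[category][mention] = (counts[category][mention] || 0) + 1;
    }
  }
  return counts;
}

// Excerpt of the grouped output shown above.
const nerEntities = {
  LOCATION: ['Berlin', 'Prussia', 'Weimar', 'Berlin', 'East Berlin',
    'East Germany', 'West Berlin', 'Berlin', 'Germany'],
  ORGANIZATION: [],
  DATE: ['13th century', '1918', '1918', '1919', '1933', '1920s', '1961', '1990'],
  PERSON: ['Reich']
};

console.log(countMentions(nerEntities));
```

Note that 'East Berlin' and 'West Berlin' are counted as their own mentions; the sketch compares whole strings only.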

Usage for GeoNames search:

  var publicEye   = require('public-eye')({
    services: {
      geonames: {
        username: 'your-username'
      }
    }
  });

  publicEye.geonames({
    text: 'Osh' // a city in Kyrgyzstan
  }, function(err, body){
      // console.log(body.geonames)
      // => [ 
      //      { adminCode1: '08',
      //        lng: '72.7985',
      //        geonameId: 1527534,
      //        toponymName: 'Osh',
      //        countryId: '1527747',
      //        fcl: 'P',
      //        population: 200000,
      //        countryCode: 'KG',
      //        name: 'Osh',
      //        fclName: 'city, village,...',
      //        countryName: 'Kyrgyzstan',
      //        fcodeName: 'seat of a first-order administrative division',
      //        adminName1: 'Osh',
      //        lat: '40.52828',
      //        fcode: 'PPLA' 
      //       },
      //       ...
      //     ]
  });
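
In the GeoNames result above, `lat` and `lng` are strings while `population` is a number. The sketch below picks the most populous match and converts the coordinates to numbers. `mostPopulous` is a hypothetical helper; the second record in the sample is a placeholder added only to exercise the comparison and is not real GeoNames data.

```javascript
// Hypothetical helper, not part of public-eye: returns the most populous
// place from a GeoNames-style body, with numeric coordinates.
function mostPopulous(body) {
  const places = (body && body.geonames) || [];
  if (places.length === 0) {
    return null;
  }
  const best = places.reduce((a, b) =>
    ((b.population || 0) > (a.population || 0) ? b : a));
  return {
    name: best.name,
    countryCode: best.countryCode,
    lat: parseFloat(best.lat),
    lng: parseFloat(best.lng),
    population: best.population
  };
}

const geoBody = {
  geonames: [
    // Placeholder record, invented for illustration.
    { name: 'Placeholder', countryCode: 'XX', lat: '0', lng: '0', population: 0 },
    // Fields copied from the result shown above.
    { name: 'Osh', countryCode: 'KG', lat: '40.52828', lng: '72.7985', population: 200000 }
  ]
};

console.log(mostPopulous(geoBody));
```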

More services to come, stay tuned!