Coreference resolution in Python NLTK using Stanford CoreNLP

Stanford CoreNLP provides coreference resolution, as mentioned here. This thread, and this one, also give some insight into its implementation in Java.

However, I am using Python and NLTK, and I am not sure how to use CoreNLP's coreference resolution from my Python code. I have been able to set up StanfordParser in NLTK; this is my code so far.

from nltk.parse.stanford import StanfordDependencyParser
stanford_parser_dir = 'stanford-parser/'
eng_model_path = stanford_parser_dir  + "stanford-parser-models/edu/stanford/nlp/models/lexparser/englishRNN.ser.gz"
my_path_to_models_jar = stanford_parser_dir  + "stanford-parser-3.5.2-models.jar"
my_path_to_jar = stanford_parser_dir  + "stanford-parser.jar"

How can I use CoreNLP's coreference resolution in Python?


As mentioned by @Igor, you can try the Python wrapper implemented in this GitHub repo:

This repo contains two main files:

Make the following changes to get CoreNLP working:

  1. In the, change the path of the corenlp folder: set it to the location of the corenlp folder on your local machine, at line 144 of

    if not corenlp_path:
        corenlp_path = <path to the corenlp file>

  2. The jar file version number in “” may differ from yours. Set it to match the CoreNLP version that you have, at line 135 of

    jars = ["stanford-corenlp-3.4.1.jar",

Here, replace 3.4.1 with the version of the jar that you have downloaded.

  3. Run the command:


This will start a server.

  4. Now run the main client program


This returns a dictionary, and you can access the coreference output using ‘coref’ as the key:

For example, for the input "John is a Computer Scientist. He likes coding.":

     "coref": [[[["a Computer Scientist", 0, 4, 2, 5], ["John", 0, 0, 0, 1]], [["He", 1, 0, 0, 1], ["John", 0, 0, 0, 1]]]]
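To get from that nested list to plain (mention, referent) text pairs, something like the sketch below might help. It assumes the result is shaped exactly like the sample above (a list of groups, each holding two-element pairs whose first field is the mention text); `coref_pairs` is a hypothetical helper name, not part of the wrapper, and the numeric fields are ignored because the example does not say what they mean.

```python
import json

# Sample result copied from the example output above.
raw = (
    '{"coref": [[[["a Computer Scientist", 0, 4, 2, 5], ["John", 0, 0, 0, 1]], '
    '[["He", 1, 0, 0, 1], ["John", 0, 0, 0, 1]]]]}'
)


def coref_pairs(result):
    """Flatten the 'coref' entry into (mention_text, referent_text) tuples."""
    pairs = []
    for group in result.get("coref", []):
        for mention, referent in group:
            # The first field of each entry is the surface text.
            pairs.append((mention[0], referent[0]))
    return pairs


result = json.loads(raw)
pairs = coref_pairs(result)
for mention_text, referent_text in pairs:
    print(f"{mention_text!r} refers to {referent_text!r}")
```

For the sample sentence this links both "a Computer Scientist" and "He" back to "John".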

I have tried this on Ubuntu 16.04. Use Java version 7 or 8.
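For step 2, if you are not sure which version string to put in the jars list, a small helper along these lines could read it off the jar file name in your corenlp folder. This is only a sketch: `find_corenlp_version` is a hypothetical helper, the folder path is a placeholder, and it assumes the main jar is named like `stanford-corenlp-3.4.1.jar` as in the snippet above.

```python
import glob
import os
import re

# Matches the main jar, e.g. "stanford-corenlp-3.4.1.jar", but not
# "stanford-corenlp-3.4.1-models.jar".
_JAR_RE = re.compile(r"stanford-corenlp-(\d+(?:\.\d+)*)\.jar")


def find_corenlp_version(corenlp_dir):
    """Return the version of the main CoreNLP jar in corenlp_dir, or None."""
    pattern = os.path.join(corenlp_dir, "stanford-corenlp-*.jar")
    for path in sorted(glob.glob(pattern)):
        match = _JAR_RE.fullmatch(os.path.basename(path))
        if match:
            return match.group(1)
    return None


# Placeholder path: replace it with the location of your corenlp folder.
print(find_corenlp_version("path/to/corenlp"))
```

It prints None when no matching jar is found, which is a hint that the path is wrong.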

