Why do the keyword phrases include a PRON, like "it"? #271

@chencjiajy

I ran the following code snippet. The output includes the word "it", even though pos_kept does not include PRON.

import spacy
import pytextrank

nlp = spacy.load("en_core_web_sm")
# add PyTextRank to the spaCy pipeline
nlp.add_pipe("textrank", config={'pos_kept': ["NOUN", "PROPN", "VERB"]})

text = '''The MCU SDK for WRG1 general firmware has been launched, and it can be automatically generated after creating the product.'''
doc = nlp(text)

for phrase in doc._.phrases[:10]:
    print(phrase.text, phrase.rank, phrase.count, phrase.chunks)

# Output:
# the product 0.12286712485174818 1 [the product]
# WRG1 general firmware 0.10712303413227088 1 [WRG1 general firmware]
# The MCU SDK 0.0834726982382997 1 [The MCU SDK]
# it 0.0 1 [it]
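
As a possible workaround, the phrases could be filtered after extraction. This is a minimal sketch, not PyTextRank's own behavior: `keep_phrases_with_pos` is a hypothetical helper, and it assumes each phrase's `chunks` holds spaCy spans whose tokens expose `pos_`. It drops any phrase that has no token with one of the wanted POS tags.

```python
def keep_phrases_with_pos(phrases, pos_kept):
    """Return the phrases that contain at least one token whose POS is in pos_kept.

    Hypothetical helper (not part of PyTextRank). Assumes phrase.chunks is a
    list of spaCy Span objects, so iterating a chunk yields tokens with .pos_.
    """
    allowed = set(pos_kept)
    kept = []
    for phrase in phrases:
        # Keep the phrase if any token in any of its chunks has a wanted POS tag.
        if any(tok.pos_ in allowed for chunk in phrase.chunks for tok in chunk):
            kept.append(phrase)
    return kept
```

With the snippet above, this would be called as `keep_phrases_with_pos(doc._.phrases, ["NOUN", "PROPN", "VERB"])`. Since "it" is reported with rank 0.0, filtering on `phrase.rank > 0` may be a simpler alternative, though that assumes unwanted phrases always get a zero rank.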
