ngram_vocab=True ignores single words in vocabulary #364

@mirorac

When ngram_vocab=True is set, single words (unigrams) appear to be dropped from the vocabulary. Previous versions did not behave this way, so I wanted to check whether the change is intentional or a regression.

Here’s the relevant line in the code:

vocab = phrases

Suggested fix:

vocab += phrases

Could merging the phrases with the previously built vocabulary resolve the issue, or is this the expected behavior in the latest version?
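
To make the difference concrete, here is a minimal sketch with made-up data. It assumes the vocabulary is a plain list of strings; `merge_vocab`, `words`, and `phrases` are hypothetical names for illustration, not the library's actual code.

```python
def merge_vocab(words, phrases):
    """Return single words followed by phrases, de-duplicated, order preserved.

    Hypothetical helper: assumes both inputs are iterables of strings.
    """
    # dict.fromkeys keeps the first occurrence of each key in insertion order
    return list(dict.fromkeys([*words, *phrases]))


# Made-up example data
words = ["machine", "learning", "model"]
phrases = ["machine_learning", "model"]

# Mirrors `vocab = phrases`: the single words are discarded
overwritten = list(phrases)

# Mirrors the intent of `vocab += phrases`, with duplicates removed
merged = merge_vocab(words, phrases)

print(overwritten)
print(merged)
```

Note that a plain `vocab += phrases` on a list would keep duplicates when a term appears in both collections, which is why the sketch de-duplicates; whether that matters depends on how the vocabulary is consumed downstream.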
