Spacy ValueError: [T001] Max length currently 10 for phrase matching




Is there a way to raise the maximum length for phrase matching to a much larger number?
I still get this error even though max_length is set to a much higher value (20000, see the code below).
I could not find a practical solution in previous threads.
My list of labeled terms has over 9000 entries.


import spacy
from spacy.matcher import PhraseMatcher
from spacy.tokens import Span

class EntityMatcher(object):
    name = 'entity_matcher'

    def __init__(self, nlp, terms, label):
        patterns = [nlp(text) for text in terms]
        self.matcher = PhraseMatcher(nlp.vocab, max_length=20000)
        self.matcher.add(label, None, *patterns)

    def __call__(self, doc):
        matches = self.matcher(doc)
        for match_id, start, end in matches:
            span = Span(doc, start, end, label=match_id)
            doc.ents = list(doc.ents) + [span]
        return doc


nlp = spacy.blank('en')
text = open('/Users/Desktop/9000drugs.txt').read()
drugs = text.splitlines()
print(drugs)

entity_matcher = EntityMatcher(nlp, drugs, 'DRUG')

nlp.add_pipe(entity_matcher)
text = open('/Users/Desktop/extract_text.txt', 'r').read() # open a document
doc = nlp(text)


ents = [(e.start_char, e.end_char, e.label_) for e in doc.ents]

a = '{}'.format(doc.text)
annotation = (a, {'entities': ents})

print(annotation)
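
One way to narrow this down is to check which terms in the list are actually long enough to trigger the error. The sketch below is not part of my original code and does not depend on spaCy: `split_by_token_length` is a hypothetical helper, and it assumes that the limit in the message applies to the number of tokens in each individual pattern (not to the number of terms) and that patterns of exactly 10 tokens are still accepted. By default it counts whitespace-separated tokens, which may differ from spaCy's own tokenization.

```python
def split_by_token_length(terms, tokenize=str.split, limit=10):
    """Partition terms into (short, too_long) by token count.

    `limit` is an assumption taken from the error message ("Max length
    currently 10"); `tokenize` is any callable whose result supports len().
    """
    short, too_long = [], []
    for term in terms:
        n_tokens = len(tokenize(term))
        # Terms with more than `limit` tokens are set aside for inspection.
        if n_tokens > limit:
            too_long.append(term)
        else:
            short.append(term)
    return short, too_long


# Illustrative terms only; the real list comes from 9000drugs.txt.
terms = ["aspirin", "acetylsalicylic acid", " ".join(["word"] * 12)]
short, too_long = split_by_token_length(terms)
print(len(short), len(too_long))  # prints: 2 1
```

To count tokens the way the pipeline does, `tokenize=nlp.make_doc` could be passed instead of `str.split`, since the length of a `Doc` is its token count. Only the `short` list would then be handed to `EntityMatcher`, and the `too_long` terms could be reviewed separately.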








