from_text
ucca.convert.from_text(text, passage_id='1', tokenized=False, one_per_line=False, extra_format=None, lang='en', return_text=False, *args, **kwargs)
Converts text (raw or already tokenized) to Passage objects.
Parameters:
- text – a multi-line string or a sequence of strings; each line becomes a new paragraph, and blank lines separate passages
- passage_id – prefix of ID to set for returned passages
- tokenized – whether the text is already given as a list of tokens
- one_per_line – each line will be a new passage rather than just a new paragraph
- extra_format – value to set in passage.extra["format"]
- lang – language to use for the tokenization model
- return_text – whether to return the original text with each passage and not just the passage itself
Returns: a generator of Passage objects containing only Terminal units
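
The way text is split into paragraphs and passages can be illustrated with a small standalone sketch. It does not call the library and is not its implementation: group_paragraphs is a hypothetical helper that only mirrors the documented rules for text and one_per_line (each line is a paragraph, blank lines separate passages, and one_per_line makes every line its own passage). Tokenization, passage IDs, and the construction of Terminal units are left out.

```python
from typing import Iterable, Iterator, List, Union


def group_paragraphs(text: Union[str, Iterable[str]],
                     one_per_line: bool = False) -> Iterator[List[str]]:
    """Yield one list of paragraph strings per passage.

    Hypothetical helper, not part of ucca: it only mimics the splitting
    rules described for the `text` and `one_per_line` parameters.
    """
    # Accept either a multi-line string or a sequence of lines.
    lines = text.splitlines() if isinstance(text, str) else list(text)
    current: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            # A blank line closes the passage collected so far.
            if current:
                yield current
                current = []
        elif one_per_line:
            # Every non-blank line becomes a passage of its own.
            yield [line]
        else:
            # Otherwise the line is one more paragraph of the current passage.
            current.append(line)
    if current:
        yield current


sample = "First paragraph.\nSecond paragraph.\n\nA new passage."
print(list(group_paragraphs(sample)))
print(list(group_paragraphs(sample, one_per_line=True)))
```

With the default settings the sample yields two groups (a two-paragraph passage and a one-paragraph passage); with one_per_line=True it yields three single-paragraph groups. Assuming the library is installed, the corresponding real call would be along the lines of list(from_text(sample, passage_id='doc')), which produces Passage objects rather than lists of strings.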