public class ConcurrentAnalyzerParser extends java.lang.Object implements MonolingualCorpusParser

A MonolingualCorpusParser that is equivalent to an AnalyzerParser but performs its work concurrently, making use of the specified number of threads.

Constructor and Description |
---|
ConcurrentAnalyzerParser(Analyzer analyzer): Constructs a new parser with the specified Analyzer, assuming that the input text has a single sentence per line, that will make use of as many threads as this machine can run in parallel. |
ConcurrentAnalyzerParser(Analyzer analyzer, boolean sentencePerLine): Constructs a new parser with the specified Analyzer that will make use of as many threads as this machine can run in parallel. |
ConcurrentAnalyzerParser(Analyzer analyzer, boolean sentencePerLine, int nThreads): Constructs a new parser with the specified Analyzer. |
ConcurrentAnalyzerParser(Analyzer analyzer, boolean sentencePerLine, int nThreads, int queueSize): Constructs a new parser with the specified Analyzer. |

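
The nThreads and queueSize arguments suggest a worker pool combined with a bounded buffer of pending results. The sketch below illustrates that general idea using only the JDK; it is not the library's implementation. The class BoundedOrderedPipelineSketch, its process method, and the use of a plain Function in place of an Analyzer are all hypothetical stand-ins, and the way the bound is enforced (waiting for the oldest result before submitting more work) is an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical illustration; none of these names belong to the documented API.
class BoundedOrderedPipelineSketch {

    // Applies `analyze` to every line on `nThreads` workers, keeping at most
    // `queueSize` results pending, and returns the results in input order.
    static List<String> process(List<String> lines, Function<String, String> analyze,
                                int nThreads, int queueSize) {
        if (nThreads < 1 || queueSize < 1) {
            throw new IllegalArgumentException("nThreads and queueSize must be positive");
        }
        ExecutorService workers = Executors.newFixedThreadPool(nThreads);
        Deque<Future<String>> pending = new ArrayDeque<>();
        List<String> output = new ArrayList<>();
        try {
            for (String line : lines) {
                // Stop consuming input while the buffer is full: wait for the
                // oldest pending result, which frees one slot.
                if (pending.size() >= queueSize) {
                    output.add(pending.removeFirst().get());
                }
                Callable<String> task = () -> analyze.apply(line);
                pending.addLast(workers.submit(task));
            }
            // Drain the remaining results, oldest first, so order is preserved.
            while (!pending.isEmpty()) {
                output.add(pending.removeFirst().get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a result", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("analysis failed", e.getCause());
        } finally {
            workers.shutdownNow();
        }
        return output;
    }

    public static void main(String[] args) {
        // "As many threads as this machine can run in parallel" is assumed here
        // to correspond to the number of available processors.
        int nThreads = Runtime.getRuntime().availableProcessors();
        List<String> out = process(Arrays.asList("one", "two", "three"),
                                   String::toUpperCase, nThreads, 2);
        System.out.println(out); // prints [ONE, TWO, THREE]
    }
}
```

Collecting futures in submission order keeps the output aligned with the input even when workers finish out of order, which is one way a concurrent parser could stay equivalent to a sequential one.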
Modifier and Type | Method and Description |
---|---|
void | parseMonolingualCorpus(MonolingualCorpusBuilder builder, java.io.Reader... in): Builds a MonolingualCorpus by parsing some input. |
void | parseMonolingualCorpus(MonolingualCorpusBuilder builder, java.util.Set<java.lang.Integer> removedIndexes, java.io.Reader... in): Builds a MonolingualCorpus by parsing some input. |

public ConcurrentAnalyzerParser(Analyzer analyzer)

Constructs a new parser with the specified Analyzer, assuming that the input text has a single sentence per line, that will make use of as many threads as this machine can run in parallel.

Parameters:
analyzer - the Analyzer that has to be applied to the input text in order to build a MonolingualCorpus.

public ConcurrentAnalyzerParser(Analyzer analyzer, boolean sentencePerLine)

Constructs a new parser with the specified Analyzer that will make use of as many threads as this machine can run in parallel.

Parameters:
analyzer - the Analyzer that has to be applied to the input text in order to build a MonolingualCorpus.
sentencePerLine - whether the input text has a single sentence per line. If false, the analyzer will try to split each line into sentences by itself, but it might still decide that a line consists of a single sentence.

public ConcurrentAnalyzerParser(Analyzer analyzer, boolean sentencePerLine, int nThreads)

Constructs a new parser with the specified Analyzer.

Parameters:
analyzer - the Analyzer that has to be applied to the input text in order to build a MonolingualCorpus.
sentencePerLine - whether the input text has a single sentence per line. If false, the analyzer will try to split each line into sentences by itself, but it might still decide that a line consists of a single sentence.
nThreads - the number of threads that the parser should make use of.

public ConcurrentAnalyzerParser(Analyzer analyzer, boolean sentencePerLine, int nThreads, int queueSize)

Constructs a new parser with the specified Analyzer.

Parameters:
analyzer - the Analyzer that has to be applied to the input text in order to build a MonolingualCorpus.
sentencePerLine - whether the input text has a single sentence per line. If false, the analyzer will try to split each line into sentences by itself, but it might still decide that a line consists of a single sentence.
nThreads - the number of threads that the parser should make use of.
queueSize - the maximum number of elements to keep in the output queue. Once the queue is full, the parser stops consuming input until some space is made.

public void parseMonolingualCorpus(MonolingualCorpusBuilder builder, java.io.Reader... in) throws ParseException

Builds a MonolingualCorpus by parsing some input.

Specified by:
parseMonolingualCorpus in interface MonolingualCorpusParser

Parameters:
builder - the MonolingualCorpusBuilder with which to build the corpus.
in - the input plain text(s) to read from. If more than one is given, the output is the concatenation of all of them, in the same order.

Throws:
ParseException - if a parsing error occurs.

public void parseMonolingualCorpus(MonolingualCorpusBuilder builder, java.util.Set<java.lang.Integer> removedIndexes, java.io.Reader... in) throws ParseException

Builds a MonolingualCorpus by parsing some input.

Specified by:
parseMonolingualCorpus in interface MonolingualCorpusParser

Parameters:
builder - the MonolingualCorpusBuilder with which to build the corpus.
removedIndexes - the set of indexes of the sentences to remove. Indexes start from 1.
in - the input plain text(s) to read from. If more than one is given, the output is the concatenation of all of them, in the same order.

Throws:
ParseException - if a parsing error occurs.
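
To make the removedIndexes and in parameters concrete, here is a minimal sketch of 1-based sentence removal over several concatenated inputs, written against the JDK only. The class RemovedIndexesSketch and its readKeptSentences method are hypothetical, not part of the documented API. The sketch assumes one sentence per line and assumes that the index keeps counting across inputs rather than restarting for each Reader; the documentation above does not confirm either point.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; none of these names belong to the documented API.
class RemovedIndexesSketch {

    // Reads the inputs in the given order and returns every line whose
    // 1-based position is not listed in `removedIndexes`.
    static List<String> readKeptSentences(Set<Integer> removedIndexes, Reader... in) {
        List<String> kept = new ArrayList<>();
        int index = 0; // assumption: one running index across all inputs
        try {
            for (Reader reader : in) {
                BufferedReader lines = new BufferedReader(reader);
                String line;
                while ((line = lines.readLine()) != null) {
                    index++; // the first sentence has index 1
                    if (!removedIndexes.contains(index)) {
                        kept.add(line);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<Integer> removed = new HashSet<>(Arrays.asList(2, 3));
        List<String> kept = readKeptSentences(removed,
                new StringReader("a\nb\n"), new StringReader("c\nd"));
        System.out.println(kept); // prints [a, d]
    }
}
```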