public class XMLParser extends java.lang.Object implements MonolingualCorpusParser, MonolingualCorpusWriter

Parses and writes MonolingualCorpuses in our custom XML format.

Constructor and Description |
---|
XMLParser() - Constructs a new parser. |
Modifier and Type | Method and Description |
---|---|
MonolingualCorpusBuilder | getWriterCorpusBuilder(java.io.Writer out) - Returns a wrapper MonolingualCorpusBuilder that writes sentences using this MonolingualCorpusWriter as they are added to it. |
void | parseMonolingualCorpus(MonolingualCorpusBuilder builder, java.io.Reader... in) - Builds a MonolingualCorpus by parsing some input in our custom XML format. |
void | parseMonolingualCorpus(MonolingualCorpusBuilder builder, java.util.Set<java.lang.Integer> removedIndexes, java.io.Reader... in) - Builds a MonolingualCorpus by parsing some input in our custom XML format and removing the sentences at the specified indexes. |
void | writeMonolingualCorpus(MonolingualCorpus corpus, java.io.Writer out) - Writes a MonolingualCorpus in our custom XML format. |
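
The removedIndexes overload of parseMonolingualCorpus takes 1-based sentence indexes and, like the other overload, concatenates multiple inputs in order. The following is only a sketch of that behavior, not the library's implementation: it models sentences as plain strings, the names RemovedIndexesSketch and concatAndRemove are hypothetical, and it assumes that index numbering continues across inputs rather than restarting for each one, which the documentation does not state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in: sentences are plain strings, not the real corpus types.
final class RemovedIndexesSketch {

    // Concatenates the inputs in order and drops every sentence whose 1-based
    // position in the concatenation is contained in removedIndexes.
    // ASSUMPTION: numbering continues across inputs instead of restarting.
    static List<String> concatAndRemove(Set<Integer> removedIndexes,
                                        List<List<String>> inputs) {
        List<String> kept = new ArrayList<>();
        int index = 0;
        for (List<String> input : inputs) {
            for (String sentence : input) {
                index++; // the first sentence gets index 1
                if (!removedIndexes.contains(index)) {
                    kept.add(sentence);
                }
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<List<String>> inputs = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", "d"));
        Set<Integer> removed = new HashSet<>(Arrays.asList(1, 3));
        System.out.println(concatAndRemove(removed, inputs)); // prints [b, d]
    }
}
```

An index of 0 never matches anything under this model, because counting starts at 1.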
public void parseMonolingualCorpus(MonolingualCorpusBuilder builder, java.io.Reader... in) throws ParseException

Builds a MonolingualCorpus by parsing some input in our custom XML format.

Specified by:
parseMonolingualCorpus in interface MonolingualCorpusParser

Parameters:
builder - the MonolingualCorpusBuilder with which to build the corpus.
in - the input XML(s) to read from. If more than one is given, the produced output will be the concatenation of all of them in the same order.

Throws:
ParseException - if a parsing error occurs.

public void parseMonolingualCorpus(MonolingualCorpusBuilder builder, java.util.Set<java.lang.Integer> removedIndexes, java.io.Reader... in) throws ParseException

Builds a MonolingualCorpus by parsing some input in our custom XML format and removing the sentences at the specified indexes.

Specified by:
parseMonolingualCorpus in interface MonolingualCorpusParser

Parameters:
builder - the MonolingualCorpusBuilder with which to build the corpus.
removedIndexes - the set of indexes of the sentences to remove, starting from 1.
in - the input XML(s) to read from. If more than one is given, the produced output will be the concatenation of all of them in the same order.

Throws:
ParseException - if a parsing error occurs.

public void writeMonolingualCorpus(MonolingualCorpus corpus, java.io.Writer out) throws ParseException

Writes a MonolingualCorpus in our custom XML format.

Specified by:
writeMonolingualCorpus in interface MonolingualCorpusWriter

Parameters:
corpus - the MonolingualCorpus to write.
out - the Writer to write the output to.

Throws:
ParseException - if a writing error occurs.

public MonolingualCorpusBuilder getWriterCorpusBuilder(java.io.Writer out) throws ParseException

Description copied from interface: MonolingualCorpusWriter

Returns a wrapper MonolingualCorpusBuilder that writes sentences using this MonolingualCorpusWriter as they are added to it.

Specified by:
getWriterCorpusBuilder in interface MonolingualCorpusWriter

Parameters:
out - the Writer to write the output to.

Returns:
a wrapper MonolingualCorpusBuilder that writes sentences using this MonolingualCorpusWriter as they are added to it.

Throws:
ParseException - if a writing error occurs.
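
The idea behind getWriterCorpusBuilder, a builder that writes each sentence as soon as it is added, can be sketched as follows. This is an illustration under stated assumptions, not the real API: the interface SentenceBuilderSketch, its addSentence method and the class WritingBuilderSketch are hypothetical stand-ins, and the sketch writes one sentence per line instead of reproducing the custom XML format, which this page does not describe.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for a builder that receives sentences one at a time.
interface SentenceBuilderSketch {
    void addSentence(String sentence);
}

// Writes each sentence to the underlying Writer as soon as it is added,
// so the whole corpus never has to be held in memory.
// ASSUMPTION: one sentence per line; the real output format is XML.
final class WritingBuilderSketch implements SentenceBuilderSketch {
    private final Writer out;

    WritingBuilderSketch(Writer out) {
        this.out = out;
    }

    @Override
    public void addSentence(String sentence) {
        try {
            out.write(sentence);
            out.write('\n');
        } catch (IOException e) {
            // The sketch rethrows unchecked; the documented method reports
            // writing errors as ParseException instead.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringWriter target = new StringWriter();
        SentenceBuilderSketch builder = new WritingBuilderSketch(target);
        builder.addSentence("first sentence");
        builder.addSentence("second sentence");
        System.out.print(target); // both sentences, one per line
    }
}
```

With this shape, a parser can be pointed at the wrapper builder so that sentences are streamed from the input to the output without building the corpus in between.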