Accommodations deduplication
Date
2018-09-25
Author
Pérez Sena, Francis Damián
Abstract
This work addresses the deduplication of accommodations. Deduplication is a
special case of entity resolution (ER) that consists of grouping different
representations of the same entity, usually coming from different sources. It is
a complex process that requires several phases, the most common being blocking
and pair resolution. In addition to these, we introduce a new phase, clustering,
which was not considered in previous work. We aim to build a framework that
covers the different phases and to design a clustering strategy that maximizes
precision while retaining the highest possible recall.
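
The three phases named in the abstract can be illustrated with a minimal sketch. It is not the framework described in the thesis: the record fields (`name`, `city`), the blocking key, the string-similarity measure, the 0.8 threshold, and the use of connected components for clustering are all assumptions chosen only to make the phases concrete.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def block(records):
    """Blocking: group record ids under a cheap key so that only records
    sharing a block are compared. The key (city, first letter of the name)
    is a hypothetical choice."""
    blocks = defaultdict(list)
    for rec_id, rec in records.items():
        key = (rec["city"].strip().lower(), rec["name"].strip().lower()[:1])
        blocks[key].append(rec_id)
    return blocks


def is_match(a, b, threshold=0.8):
    """Pair resolution: decide whether two records describe the same
    accommodation. Name similarity and the threshold are assumptions."""
    ratio = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return ratio >= threshold


def cluster(record_ids, matched_pairs):
    """Clustering: merge matched pairs into groups. Connected components
    (union-find) is used here as a simple stand-in strategy."""
    parent = {r: r for r in record_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in matched_pairs:
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for r in record_ids:
        groups[find(r)].add(r)
    return list(groups.values())


def deduplicate(records):
    """Run blocking, pair resolution, and clustering in sequence."""
    pairs = []
    for ids in block(records).values():
        for a, b in combinations(ids, 2):
            if is_match(records[a], records[b]):
                pairs.append((a, b))
    return cluster(records.keys(), pairs)
```

Connected components merges every chain of matched pairs, which tends to favor recall. A strategy aimed at high precision, as the abstract proposes, would presumably apply stricter merge criteria at this step.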