EMaaS - Entity Matching-as-a-Service
Entity Matching-as-a-Service (EMaaS) targets the problem of identifying records that refer to the same real-world entity. The task is challenging because it relies on pairwise comparisons, so its cost grows rapidly when the datasets involved are large (Big Data). Because EM is critical for data cleaning and integration, e.g., finding duplicate points of interest across different databases, there is growing interest in the challenges of, and possible solutions for, running EM on modern parallel computing frameworks such as Apache Spark.
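
To make the pairwise cost concrete, the following is a minimal single-machine sketch of naive entity matching in Python. It is not the EMaaS implementation: the `similarity` and `match_pairs` helpers, the 0.85 threshold, and the sample point-of-interest names are all assumptions chosen for illustration. It only shows that comparing every record with every other record requires n(n-1)/2 comparisons.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1] (illustrative measure only)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_pairs(records, threshold=0.85):
    """Compare every pair of records; return matching index pairs and the comparison count."""
    matches = []
    comparisons = 0
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        comparisons += 1
        if similarity(a, b) >= threshold:
            matches.append((i, j))
    return matches, comparisons


# Hypothetical point-of-interest names; the first two refer to the same place.
records = [
    "Central Park Cafe",
    "central park cafe",
    "Museum of Modern Art",
    "Grand Central Terminal",
]
matches, comparisons = match_pairs(records)
print(matches, comparisons)
```

With 4 records the loop performs 6 comparisons; with one million records it would perform roughly 5 x 10^11, which is why distributing the comparisons over a cluster becomes attractive.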

Owner type: Academia/Research