Efficient Graph-based Document Similarity
Published: June 2016
Editors: Harald Sack, Eva Blomqvist, Mathieu d'Aquin, Chiara Ghidini, Simone Paolo Ponzetto, Christoph Lange
Book title: The Semantic Web. Latest Advances and New Domains.
Publisher: Springer-Verlag
Organization: 13th Extended Semantic Web Conference (ESWC)
Refereed publication
Abstract
Assessing the relatedness of documents is at the core of many applications such as document retrieval and recommendation. Most similarity approaches operate on word-distribution-based document representations, which are fast to compute but problematic when documents differ in language, vocabulary or type, and which neglect the rich relational knowledge available in Knowledge Graphs. In contrast, graph-based document models can leverage valuable knowledge about relations between entities; however, due to expensive graph operations, similarity assessments tend to become infeasible in many applications. This paper presents an efficient semantic similarity approach exploiting explicit hierarchical and transversal relations. We show in our experiments that (i) our similarity measure provides a significantly higher correlation with human notions of document similarity than comparable measures, (ii) this also holds for short documents with few annotations, and (iii) document similarity can be calculated efficiently compared to other graph-traversal-based approaches.
Download: Media:ESWC-2016.pdf
Web Science and Knowledge Management