DSpace Concept Linking: An Innovative Feature for Virtual Olympic Museum

Baoyao Zhou, Yuhong Xiong and Wei Liu (HP Labs China)
Xunkun Shen, Yue Qi, Jiahui Wang and Yong Hu (Beihang University, China)
3 April 2008


Outline

- Introduction
- Problem Statement
- Our Solution
- Concept-tree Construction
- Concept Matching and Linking
- Current Status and Next Steps


Introduction

- Virtual Olympic Museum (VOM)
  - One of the major “Digital Olympics” projects.
  - It aims to use the DSpace platform to create an archive of the Beijing 2008 Olympic Games.
  - A collaborative project between HP Labs China and Beihang University.
- DSpace concept linking
- Classification-based searching and recommendation

Problem Statement

- DSpace data model: a hierarchy of communities, collections and items.
- DSpace concept linking: discover richer semantic relationships among DSpace resources, which may not be captured by the community and collection hierarchy, and link those resources together across different communities and collections to further facilitate content navigation.

- DSpace concept: any tag (name, title, alternative name or alternative title) of a DSpace resource (community, collection or item).
- The alternative name and alternative title store the synonyms of the corresponding name and title, for example “Beijing 2008” → “the Games of the XXIX Olympiad; the 2008 Olympics”.
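As a rough illustration of how one resource can contribute several concepts, the sketch below expands a name and its alternative names into (concept, URL) pairs. The dictionary layout, the field names `name`, `alternative_names` and `url`, the `concepts_for` helper, the semicolon separator (taken from the slide's example) and the example.org handle URL are all assumptions made for illustration, not DSpace's actual schema.

```python
from typing import Dict, List, Tuple


def concepts_for(resource: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return one (concept, url) pair per tag of a resource.

    Assumption: alternative names are stored in a single string,
    separated by semicolons as in the slide's example.
    """
    tags = [resource["name"]]
    tags += resource.get("alternative_names", "").split(";")
    # Drop empty entries and surrounding whitespace.
    return [(tag.strip(), resource["url"]) for tag in tags if tag.strip()]


# Hypothetical resource record; the URL is a placeholder.
beijing = {
    "name": "Beijing 2008",
    "alternative_names": "the Games of the XXIX Olympiad; the 2008 Olympics",
    "url": "http://example.org/handle/123456789/1",
}

pairs = concepts_for(beijing)
for concept, url in pairs:
    print(concept, "->", url)
```

All three concepts point at the same resource, so any of the synonyms found in a text would link to the same place.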

Our Solution

System architecture:

(1) Retrieve all DSpace concepts, together with the handle-based URLs of the corresponding DSpace resources, from the DSpace database;
(2) Identify and highlight DSpace concepts in DSpace textual content, and add the corresponding hyperlinks;
(3) Write the updated textual content back to the DSpace database.

Concept-tree Construction

Concept-tree: a trie-based model that stores all DSpace concepts compactly by merging their common prefixes, which makes concept matching efficient. It consists of:

- one virtual root node, called rootNode;
- internal nodes representing words, called wordNode;
- leaf nodes representing resources (handle-based URLs), called resourceNode.

Each path from the root node to a leaf node represents one DSpace concept (a sequence of words) and the corresponding DSpace resource.

Example: the path “Root → Athens → 2004 → URL3” denotes the DSpace concept “Athens 2004” with its corresponding DSpace resource “URL3”.
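A minimal sketch of such a word-level trie is given below; it is not the authors' implementation. As a simplification, the URLs are stored on the node where a concept ends instead of in separate resourceNode leaves. The class and function names, the lowercasing of words, and every concept other than “Athens 2004” with “URL3” are assumptions for illustration.

```python
from typing import Dict, List, Tuple


class WordNode:
    """A node of the concept-tree; the root is a WordNode with no word."""

    def __init__(self) -> None:
        self.children: Dict[str, "WordNode"] = {}
        # Handle-based URLs of resources whose concept ends at this node
        # (stands in for the resourceNode leaves described in the slides).
        self.resources: List[str] = []


def build_concept_tree(concepts: List[Tuple[str, str]]) -> WordNode:
    """Insert each (concept, url) pair word by word, sharing common prefixes."""
    root = WordNode()
    for phrase, url in concepts:
        node = root
        for word in phrase.lower().split():  # assumption: case-insensitive
            node = node.children.setdefault(word, WordNode())
        node.resources.append(url)
    return root


def lookup(root: WordNode, phrase: str) -> List[str]:
    """Return the URLs stored for an exact concept, or [] if it is unknown."""
    node = root
    for word in phrase.lower().split():
        node = node.children.get(word)
        if node is None:
            return []
    return node.resources


# Only ("Athens 2004", "URL3") comes from the slides; the rest is made up.
concepts = [
    ("Athens", "URL1"),
    ("Athens 1896", "URL2"),
    ("Athens 2004", "URL3"),
    ("Beijing 2008", "URL4"),
]
tree = build_concept_tree(concepts)
print(lookup(tree, "Athens 2004"))
```

The three concepts that start with “Athens” share a single first-level node, which is the prefix merging that keeps the tree compact.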

Concept Matching and Linking

Main tasks:

- Treat each piece of DSpace textual content as a sequence of words.
- Match the sequence of words against the Concept-tree, beginning at the start pointer (initially set to the beginning of the word sequence).
- If no matching DSpace concept is found, move the start pointer forward by one word.
- If a matching DSpace concept is found, highlight the concept, add hyperlinks using the URLs stored in the Concept-tree, and move the start pointer forward past the matched concept.
- Repeat this process until the start pointer reaches the end of the word sequence.

Notes on the example:

- Only the longest matching DSpace concept is highlighted: for example, “Athens 2004” is highlighted, not just “Athens”.
- Only the first occurrence of a repeated DSpace concept is highlighted.
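The matching loop and the two rules above could be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: it splits text on whitespace only (so punctuation attached to a word prevents a match), matches case-insensitively, links to the first stored URL only, and emits a plain HTML anchor. All names and all sample concepts other than “Athens 2004” with “URL3” are hypothetical.

```python
from typing import Dict, List, Set, Tuple


class Node:
    def __init__(self) -> None:
        self.children: Dict[str, "Node"] = {}
        self.urls: List[str] = []  # non-empty when a concept ends here


def build_tree(concepts: List[Tuple[str, str]]) -> Node:
    root = Node()
    for phrase, url in concepts:
        node = root
        for word in phrase.lower().split():
            node = node.children.setdefault(word, Node())
        node.urls.append(url)
    return root


def longest_match(root: Node, words: List[str], start: int) -> Tuple[int, List[str]]:
    """Return (length in words, urls) of the longest concept at `start`, or (0, [])."""
    node, best_len, best_urls = root, 0, []
    for offset in range(start, len(words)):
        node = node.children.get(words[offset].lower())
        if node is None:
            break
        if node.urls:  # a concept ends here; remember it and keep extending
            best_len, best_urls = offset - start + 1, node.urls
    return best_len, best_urls


def link_concepts(root: Node, text: str) -> str:
    words = text.split()
    out: List[str] = []
    seen: Set[str] = set()  # concepts already linked once in this text
    i = 0
    while i < len(words):
        length, urls = longest_match(root, words, i)
        if length == 0:
            out.append(words[i])
            i += 1  # no concept here: advance by one word
            continue
        phrase = " ".join(words[i:i + length])
        if phrase.lower() in seen:
            out.append(phrase)  # repeated concept: leave as plain text
        else:
            seen.add(phrase.lower())
            out.append(f'<a href="{urls[0]}">{phrase}</a>')
        i += length  # advance past the whole matched concept
    return " ".join(out)


tree = build_tree([("Athens", "URL1"), ("Athens 2004", "URL3"), ("Beijing 2008", "URL4")])
text = "Athens 2004 followed Sydney 2000 and Athens 2004 preceded Beijing 2008"
linked = link_concepts(tree, text)
print(linked)
```

In this sample, the first “Athens 2004” is linked to URL3 rather than to the shorter concept “Athens”, and the second “Athens 2004” is left as plain text.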

Current Status and Next Steps

- As one of the key projects for the Beijing 2008 Olympic Games, the Virtual Olympic Museum is scheduled to open formally at the end of April 2008.
- Two prototypes have been deployed, one at Beihang University and one at HP Labs China:
  - 85 communities, 238 collections and 1,423 items;
  - 1,747 DSpace concepts;
  - 1,422 pieces of DSpace textual content are processed in around 160 seconds.
- Future work: contribute DSpace concept linking back to the DSpace core project as a plug-in or a standalone tool.




Thanks a lot! Q & A
