sClippy is an application that acts as a local information hub for different scientific publication repositories. Its goal is to support early-stage researchers in their initial efforts to become familiar with a particular field. sClippy lifts the shallow and deep metadata captured in scientific publications, makes it accessible for export and embedding, and uses it for information expansion. The metadata extraction process is split between the extraction of:
The shallow metadata extraction follows a low-level document engineering approach that combines mining and analysis of the publications' text based on its formatting style and font information. The deep metadata extraction, on the other hand, follows a combined empirical and linguistic approach.
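To make the font-based idea more concrete, here is a minimal sketch in Python. It is not sClippy's code: the `Span` structure, the `guess_title` function, and the "largest font on the first page is the title" rule are all assumptions chosen for illustration, and the sample text is made up. The actual rules sClippy applies to LNCS or ACM layouts are not described here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Span:
    """A run of text with uniform formatting (hypothetical structure)."""
    text: str
    font_size: float
    bold: bool = False


def guess_title(first_page: list[Span], tolerance: float = 0.01) -> str:
    """Join, in reading order, the spans set in the largest font.

    Assumed heuristic: on the first page of a paper, the title is
    the text printed in the largest font size.
    """
    if not first_page:
        return ""
    largest = max(span.font_size for span in first_page)
    title_parts = [
        span.text.strip()
        for span in first_page
        if abs(span.font_size - largest) <= tolerance
    ]
    return " ".join(title_parts)


# Made-up first-page spans, as a PDF text extractor might report them.
spans = [
    Span("An Example", 14.0, bold=True),
    Span("Paper Title", 14.0, bold=True),
    Span("First Author and Second Author", 10.0),
    Span("Abstract.", 9.0, bold=True),
    Span("This paper describes ...", 9.0),
]

print(guess_title(spans))
```

A real extractor would likely combine several such cues (font size, weight, position on the page) and keep a separate rule set per formatting style.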
sClippy currently does the following:
- Perform automatic extraction of shallow metadata (listed above) from publications encoded as PDF documents and formatted with the LNCS or ACM styles.
- Perform automatic extraction of knowledge items from the publications' content. The main idea of exposing the deep metadata is to give the user a quick overview of the paper's main contributions without having to read the entire paper.
- Export the extracted metadata in RDF (in a particular format).
- Embed the extracted metadata into the original document.
- Perform information expansion based on the publication's title and authors, by using a particular publication repository.
- Provide the means for exploring the co-author space, starting from a selected author and using the same repository.
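The co-author exploration mentioned above can be pictured as a breadth-first walk over a co-author graph. The sketch below is only an illustration under stated assumptions: `explore_coauthors`, the `lookup` callback, and the `SAMPLE` data are hypothetical, and the dictionary stands in for queries against the publication repository, whose interface is not described here.

```python
from collections import deque


def explore_coauthors(start, lookup, max_depth=2):
    """Breadth-first walk of the co-author space.

    `lookup(author)` is a hypothetical callback that returns the
    co-authors of `author`. Returns a dict mapping each reached
    author to the distance (in co-authorship hops) from `start`.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        author = queue.popleft()
        if distance[author] == max_depth:
            continue  # do not expand beyond the requested depth
        for coauthor in lookup(author):
            if coauthor not in distance:
                distance[coauthor] = distance[author] + 1
                queue.append(coauthor)
    return distance


# Made-up data standing in for repository query results.
SAMPLE = {
    "Author A": ["Author B", "Author C"],
    "Author B": ["Author A", "Author D"],
    "Author C": ["Author A"],
    "Author D": ["Author B", "Author E"],
    "Author E": ["Author D"],
}

result = explore_coauthors("Author A", lambda a: SAMPLE.get(a, []), max_depth=2)
print(result)
```

Limiting the depth keeps the number of repository lookups bounded, which matters because the co-author space grows quickly with each hop.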
sClippy is implemented as an Eclipse application, which makes it highly extensible. In essence, sClippy provides the framework that connects three types of plugins: