Semanta


Introduction



"Semanta is seriously not fond of Drupal!"

What? Semanta is a group of add-ins/extensions to popular existing Mail User Agents (MUAs). So far, Semanta has been integrated as an add-in to Microsoft Outlook and as an extension to Mozilla Thunderbird. Based on the knowledge in the sMail Conceptual Framework, Semanta supports the user with the management of workflows (e.g. Meeting Scheduling, Task Delegation, Event Announcements, Information Exchange etc.) (co)executing in email threads. Semanta provides the following useful features.

Keeping track of Email Action Items. With Semanta, the user can easily keep track of emails containing action items that require attention (Fig. 1a,b), and is automatically reminded of any required replies or other actions. Rather than flagging the whole email as read or unread, Semanta considers the individual action items within it and flags each item appropriately.

inbox.png

Figure 1a: Items having pending action items (Outlook)
ThunderbirdInbox.png

Figure 1b: Items having pending action items (Thunderbird)

Action Item Handling Support. The user is supported with the handling of each action item, based on its type (Fig. 2a,b).

process.png

Figure 2a: Support for handling individual action items (Outlook)
processThunderbird.png

Figure 2b: Support for handling individual action items (Thunderbird)

Detecting Generated Tasks/Events. Semanta assists the user with exporting and saving events or tasks that were generated on the go as part of the email communication (Fig. 3).

activities.png

Figure 3: Exporting and saving email-generated tasks/events

Email Object Linking. Semanta makes email processes more visual by linking emails to contacts, files and folders, as well as to the tasks and events generated within them (Fig. 4). Links between emails in threads are also stored, enabling the user to traverse up and down email threads (Fig. 4).

eventLinking.png

Figure 4: Linking a generated Calendar item to its email of origin

Email Attachment Reminders. On sending an email, Semanta reminds users who meant to attach a file but forgot to do so.

attach.png

Figure 4b: File attachment reminders
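
The page does not describe how Semanta decides that an attachment was intended. As a minimal sketch of one common heuristic, assuming a simple keyword check on the outgoing message body (the cue words and the function name `needs_attachment_reminder` are hypothetical and not taken from Semanta), the reminder could look like this:

```python
import re
from email.message import EmailMessage

# Hypothetical cue words; Semanta's actual detection method is not documented here.
ATTACHMENT_CUES = re.compile(r"\b(attach(ed|ment|ments|ing)?|enclosed)\b", re.IGNORECASE)


def needs_attachment_reminder(msg: EmailMessage) -> bool:
    """Return True if the plain-text body mentions an attachment but none is present."""
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""
    mentions_attachment = bool(ATTACHMENT_CUES.search(text))
    has_attachment = any(True for _ in msg.iter_attachments())
    return mentions_attachment and not has_attachment


draft = EmailMessage()
draft.set_content("Hi Dirk, please find the agenda attached.")
print(needs_attachment_reminder(draft))
```

A check of this kind would run at send time, so that the user can add the missing file before the email leaves the outbox.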

Workflow-based Visualisation of Email
Semanta's support does not stop with each individual action item. Semanta also keeps track of the Email Workflows within which these action items occur. Semanta visualises these workflows, so that the user can keep track of both the individual action items (e.g. incoming Action Items that the user still has to take care of, or sent Action Items for which the user still awaits a reply) and the overall workflows within which these items take place. Figure 5a,b shows an example of a pending incoming action item in its context (the workflow, on the right).

actionitems.png

Figure 5a: Semanta Email Workflow Visualisation (Outlook)
workflowsthunderbird.png

Figure 5b: Semanta Email Workflow Visualisation (Thunderbird)

How? Semanta is based on the three models provided by the sMail Conceptual Framework. Relevant action items (speech acts) are semi-automatically annotated via a Text Analytics Service. The user can then review and change these annotations, or create new ones, via the Semanta Annotation Wizard (Fig. 4a,b).

review.png

Figure 4a: Reviewing Automatic Annotations
wizard.png

Figure 4b: Creating/modifying annotations with the Wizard

The sMail Speech Act and Speech Act Process models have been encapsulated within the sMail Ontology. Metadata about the email (including email thread information and content speech act annotations) is represented through the use of this ontology in RDF. This information is invisibly transported alongside the email content in the email headers. Semanta is grounded within the Social Semantic Desktop, thus enabling the semantic linking of email items and artefacts within (events, tasks, contacts, files etc) to custom conceptualisations of projects, groups and so forth.
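
To illustrate the idea of transporting RDF metadata in the email headers, the following is a minimal sketch using Python's standard `email` package. The header name `X-Semantic-Metadata`, the `ex:` namespace and the property names are placeholders chosen for this example; the page does not state which header or which sMail Ontology terms Semanta actually uses.

```python
import base64
from email.message import EmailMessage
from typing import Optional

# Hypothetical header name, used only for this illustration.
METADATA_HEADER = "X-Semantic-Metadata"


def attach_metadata(msg: EmailMessage, turtle: str) -> None:
    """Store an RDF (Turtle) string in a header, base64-encoded so it contains no line breaks."""
    msg[METADATA_HEADER] = base64.b64encode(turtle.encode("utf-8")).decode("ascii")


def read_metadata(msg: EmailMessage) -> Optional[str]:
    """Return the decoded RDF string, or None if the email carries no metadata."""
    raw = msg[METADATA_HEADER]
    if raw is None:
        return None
    return base64.b64decode(str(raw)).decode("utf-8")


# Placeholder vocabulary (example.org), not the real sMail Ontology.
turtle = (
    "@prefix ex: <http://example.org/smail#> .\n"
    "<mid:1234@example.org> ex:hasSpeechAct "
    "[ ex:type ex:Request ; ex:addressee <mailto:claudia@example.org> ] .\n"
)

email = EmailMessage()
email["To"] = "claudia@example.org"
email.set_content("Could you send me the meeting notes?")
attach_metadata(email, turtle)
print(read_metadata(email) == turtle)
```

Because the metadata travels in a header, a mail client without the extension simply ignores it and shows the email body as usual.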

The Semanta GUI can then provide all the functions described earlier to the user, given the transported metadata and the sMail Speech Act Process Flow model. The transported speech act annotations, if addressed to the user, become action items requiring attention. Depending on the nature of these annotations, the recipient is assisted with addressing the action items.
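
The step from transported annotations to action items can be pictured with a small sketch. The `Annotation` class, its field names and the speech act labels are assumptions made for this example only; they are not the sMail Ontology terms.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    """One speech act annotation carried with an email (illustrative fields only)."""
    email_id: str
    speech_act: str        # e.g. "request" or "commitment" (hypothetical labels)
    addressee: str
    handled: bool = False


def pending_action_items(annotations: List[Annotation], user: str) -> List[Annotation]:
    """Annotations addressed to the user and not yet handled become action items."""
    return [a for a in annotations if a.addressee == user and not a.handled]


inbox = [
    Annotation("m1", "request", "claudia@example.org"),
    Annotation("m1", "request", "dirk@example.org"),
    Annotation("m2", "commitment", "claudia@example.org", handled=True),
]
for item in pending_action_items(inbox, "claudia@example.org"):
    print(item.email_id, item.speech_act)
```

Tracking the status per annotation, rather than per email, is what allows a single email with several action items to remain flagged until each item has been dealt with.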


Architecture


Figure 6 illustrates the general architecture for Semanta. The business logic is completely separate from the GUI. The former is available within the Semantic Email service. This service uses the Text Analytics service to provide the semi-automatic annotation of email content. In turn, both services have access to the knowledge of the sMail models via the sMail Ontology. This ontology re-uses general concepts from other Semantic Desktop ontologies, e.g. Nepomuk Contact Ontology (NCO), Nepomuk Message Ontology (NMO), Nepomuk Annotation Ontology (NAO) and so on. Given the complete separation of the business logic and the GUI, Semanta can easily be implemented as an extension to any number of existing MUAs. Semanta can thus support different users using different MUAs and even different platforms, since the underlying knowledge representation language is RDF. On the Semantic Desktop, this means that email data can very easily be exported, mapped or linked to data on the user's desktop.

architecture.png

Figure 6: Architecture

Text Analytics Service


The text analytics service classifies free text from electronic conversations into speech acts, by considering the following linguistic and grammatical features:

  • Linguistic Modality (Sentence & Verb)
  • Verb type
  • Semantic Role
  • Grammatical Tense
  • Negation

Details of the technology will be included after this work is published. The service is implemented as an ANNIE Conditional Corpus IE Pipeline in GATE. We took a Knowledge Based approach to IE, whereby Language Resources were created manually by a Language Engineer based on consultations with a Speech Act Domain Modeler. Language Resources in the pipeline consist of the default gazetteer word lists for Person Named Entities, verb lists associated with the relevant verb types, modal verb lists, etc. Other miscellaneous linguistic features such as negation and anaphoric pronouns were also recorded as gazetteer entries. An overview of the IE pipeline is shown in Fig. 7. The pipeline consists of the following components:

  1. GATE English Tokeniser
  2. Modified Sentence Splitter
  3. Hepple POS Tagger (Assigns Part of Speech category to each token)
  4. ANNIE Gazetteer (finite state lookup for all verbs and key-phrases)
  5. NE transducer (Named Entity identification, e.g. Location, Person, Date)
  6. JAPE Preprocessing (separates Gazetteer Lookup annotations into individual annotation sets, i.e. action items)
  7. Simple Anaphoric Coreference (chains anaphoric pronouns to Person annotations)
  8. Speech Act JAPE grammars (the cream of the classification rules)

pipeline.png

Figure 7: GATE Pipeline for the text analytics service

Speech Act identification is mostly done in the last component, where JAPE pattern rules draw on the earlier linguistic/semantic annotations to perform speech act annotation at the sentence level. Speech Acts are extracted based on a combination of hand-coded JAPE grammars and a Finite State Gazetteer Lookup of trigger phrases.

The results of an evaluation of the Text Analytics technology are included in the Evaluation section below.
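
The actual service is built from GATE components and JAPE grammars, which are not reproduced here. As a loose analogue only, the following Python sketch mimics the same order of steps (sentence splitting, tokenisation, gazetteer-style lookup, rule-based classification). All word lists, labels and rules are invented for illustration and are far simpler than the real Language Resources.

```python
import re

# Toy stand-ins for gazetteer lists (hypothetical entries).
MODALS = {"can", "could", "will", "would", "shall", "should", "must", "may"}
NEGATIONS = {"not", "never", "cannot"}
REQUEST_TRIGGERS = ("please", "could you", "can you", "would you")
COMMIT_TRIGGERS = ("i will", "i'll", "we will", "we'll")


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenise(sentence):
    """Lower-case word tokens, keeping apostrophes so that "I'll" stays one token."""
    return re.findall(r"[a-z']+", sentence.lower())


def classify(sentence):
    """Return (label, features); label is 'request', 'commitment' or None."""
    tokens = tokenise(sentence)
    padded = " " + " ".join(tokens) + " "
    features = {
        "modal": any(t in MODALS for t in tokens),
        "negated": any(t in NEGATIONS or t.endswith("n't") for t in tokens),
        "question": sentence.rstrip().endswith("?"),
    }
    if any(f" {p} " in padded for p in REQUEST_TRIGGERS):
        return "request", features
    if any(f" {p} " in padded for p in COMMIT_TRIGGERS):
        return "commitment", features
    return None, features


text = "Hi Dirk. Could you book a room? I'll send the agenda tomorrow."
for sentence in split_sentences(text):
    print(sentence, "->", classify(sentence)[0])
```

As in the pipeline above, the classification rules come last and rely on the features produced by the earlier steps.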


Demonstrations


The following are the three most up-to-date videos for Semanta (Thunderbird), dating from October 2010.

1. Semanta's Workflow-based Email Visualisation
2. Semanta's intelligent support for workflows
3. Semanta's integration of workflow artefacts

You can also view a demo portraying an email exchange between a certain Dirk and a certain Claudia. Claudia prefers Thunderbird and Linux, whereas Dirk uses Outlook on Windows. This demonstration was presented in the ESWC 2009 demo session.

NOTE: This demo dates from July 2009, after which a number of important improvements and enhancements were made, most importantly the novel email workflow visualisations.

Evaluation


September 2008 Formative Evaluation. In September 2008 we carried out a formative evaluation of Semanta in Microsoft Outlook, with 6 users who were introduced to Semanta and instructed to carry out a number of email tasks. The results of this evaluation are included in our ESWC09 publication. The evaluation presentation, questionnaires, results and user videos are available here.

August 2009 Summative Evaluation. After enriching Semanta with additional features and carrying out further testing, we performed a summative evaluation to establish whether Semanta provides any real benefits to the average email user. Given its cross-platform availability, the evaluation was carried out using the Thunderbird plugin. Details of this evaluation will be made available once this work is published.

Text Analytics Evaluation. Results of this evaluation are published in the LREC 2010 publication below. Although modest, the results are satisfactory as a first step, especially considering that the inter-annotator agreement between humans in an earlier experiment was found to be 0.811 (see the sMail Conceptual Framework for more information). In short, the results are as follows. 12 evaluators ran the classification rules over a total of 116 emails, and rated 194 classified action items. A further 74 items were manually annotated by the evaluators. Positive ratings, representing correctly classified action items, amounted to 41%. Negative ratings, representing false positives, amounted to 31%. Missing action items amounted to 28%. We obtained an F-measure, weighing precision (0.56) and recall (0.60) equally, of 0.58. One can try out the technology here.
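
For readers who want to check the reported figure, the balanced F-measure is the harmonic mean of precision and recall. The short sketch below recomputes it from the precision and recall values quoted above; the function name `f_measure` is simply a label chosen for this example.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-measure; beta = 1 weighs precision and recall equally."""
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values quoted in the evaluation summary above.
print(round(f_measure(0.56, 0.60), 2))  # 0.58
```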


Publications


Simon Scerri, Gerhard Gossen, Siegfried Handschuh
Supporting Digital Collaborative Work With Semantic Technology: An Email Use Case.
In Proceedings of the International Conference on Knowledge Management and Information Sharing.
KMIS2010. (to appear) Valencia, Spain, October 2010.

Simon Scerri, Gerhard Gossen, Brian Davis, Siegfried Handschuh
Classifying Action Items for Semantic Email.
In Proceedings of the 7th International Conference on Language Resources and Evaluation.
LREC2010. Valletta, Malta, 2010.

Simon Scerri, Brian Davis, Siegfried Handschuh, Manfred Hauswirth
Semanta - Semantic Email made easy.
In Proceedings of the 6th European Semantic Web Conference (to appear)
ESWC2009, Crete, Greece, 2009. SLIDES

Simon Scerri, Ioana Giurgiu, Brian Davis, Siegfried Handschuh
Semanta - Semantic Email in Action.
(demo) In Proceedings of the 6th European Semantic Web Conference (to appear)
ESWC2009, Crete, Greece, 2009. DEMO

Simon Scerri, Siegfried Handschuh, Stefan Decker
Semantic Email as a Communication Medium for the Social Semantic Desktop.
In Proceedings of the 5th European Semantic Web Conference
ESWC2008, Tenerife, Spain, 2008. VIDEO

Simon Scerri
Semanta - Your Personal Email Semantic Assistant. (yes, it's a horrible title)
(demo) In Proceedings of the Intelligent User Interface Conference 2008
Maspalomas, Gran Canaria, 2008.


Downloads & Trials


Following an extensive evaluation period in August/September 2009 we are currently working on improving the current Thunderbird prototype. Contact us directly if you are interested in trying out a beta version.


Mailing List


Email the List:
semantic-email@lists.deri.org

Subscribe to the List:
Click Here to Subscribe


Contact


Simon Scerri (Main Contact)
Microsoft Outlook add-in Developer & System Designer

Gerhard Gossen
Mozilla Thunderbird Extension Developer & System Designer

Brian Davis
Text Analytics Service Support

Siegfried Handschuh
Scientific Supervisor
