ExpertFinding on Emails

The goal of this exercise is to extract keywords from emails and use them to deduce a person's expertise and track how it changes over a period of time.

How can you help?

  • I need you to download the ExpertFinder.zip file and extract its contents to a local folder.
  • Then make sure the ExpertFinding.jar is executable and run
    >> java -jar ExpertFinding.jar
  • The program asks for your email credentials and, upon successful authentication, lets you select the IMAP folders I can index. It then processes your emails one by one. This may take a while, from 30 minutes to 1 hour, depending on the size of your inbox. The content of the emails is not stored at any point.
  • On completion of the process you will find two new files, /models/{username}.mdl and /reports/{username}-keywords.txt. I need you to send me these files by email or any other way possible.

    The *.mdl files are Java serialized objects that store some of the information from your emails (details given below). The keywords file contains the list of keywords extracted from your emails, for your inspection.

What information is gathered and why?

The following information is gathered from each of your emails:

  • username : to identify the account holder
  • folder : to filter emails according to folders
  • subject : used as short textual representations of the emails while generating reports
  • sender : to keep track of conversations between people
  • receiver : same as above
  • date : used for temporal analysis
  • keywords : each email is analyzed and keywords extracted, to be used for concept based analysis of expertise.
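
To make the list above concrete, here is a minimal sketch of how one such per-email entry could be held as a Java serialized object. The class name EmailRecordSketch and its layout are assumptions made for illustration only; the actual classes inside ExpertFinding.jar may differ. Note that the sketch has no field for the email body, in line with the statement that the content of the emails is not stored.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical record type: the fields mirror the list above, but the
// real class used by ExpertFinding.jar is not shown here and may differ.
class EmailRecordSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    final String username;            // identifies the account holder
    final String folder;              // IMAP folder the email came from
    final String subject;             // short textual representation
    final String sender;
    final String receiver;
    final Date date;                  // used for temporal analysis
    final ArrayList<String> keywords; // extracted keywords, no body text

    EmailRecordSketch(String username, String folder, String subject,
                      String sender, String receiver, Date date,
                      List<String> keywords) {
        this.username = username;
        this.folder = folder;
        this.subject = subject;
        this.sender = sender;
        this.receiver = receiver;
        this.date = date;
        this.keywords = new ArrayList<>(keywords);
    }

    // Write the record with standard Java serialization.
    static byte[] toBytes(EmailRecordSketch record) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(record);
        }
        return buffer.toByteArray();
    }

    // Read a record back from its serialized form.
    static EmailRecordSketch fromBytes(byte[] bytes)
            throws IOException, ClassNotFoundException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (EmailRecordSketch) in.readObject();
        }
    }
}
```

Calling EmailRecordSketch.fromBytes(EmailRecordSketch.toBytes(record)) returns a copy with the same field values, which is the kind of round trip a serialized model file relies on.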

The source code is also available for inspection on the websvn.

What is done with the information?

An example of an expertise graph against time for Pradeep during the past nine months is shown below.

ExpertFinding-Pradeep.png
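
This page does not say how the graph is computed. As a rough sketch only, assuming that expertise over time is approximated by counting how often each keyword appears per calendar month, the aggregation could look like the code below. The names ExpertiseTimelineSketch, Observation, and countByMonth are hypothetical and are not taken from the ExpertFinding source.

```java
import java.time.YearMonth;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical aggregation step: counts keyword occurrences per month.
// This is an assumed approach, not the method used by ExpertFinding.
class ExpertiseTimelineSketch {

    // One keyword seen in one email, reduced to the month of that email.
    static final class Observation {
        final YearMonth month;
        final String keyword;

        Observation(YearMonth month, String keyword) {
            this.month = month;
            this.keyword = keyword;
        }
    }

    // Returns month -> (keyword -> count), with months in chronological order.
    static Map<YearMonth, Map<String, Integer>> countByMonth(
            List<Observation> observations) {
        Map<YearMonth, Map<String, Integer>> timeline = new TreeMap<>();
        for (Observation o : observations) {
            timeline.computeIfAbsent(o.month, m -> new TreeMap<>())
                    .merge(o.keyword, 1, Integer::sum);
        }
        return timeline;
    }
}
```

Each keyword's monthly counts then form one line of a graph like the one above. A TreeMap keeps the months sorted, so the result can be plotted directly from left to right.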

Attachment                  Size
ExpertFinding-Pradeep.png   19.36 KB
ExpertFinder.zip            34.88 MB