If you have been online for some time, you have probably accumulated a lot of email. Most email clients let you run basic searches over your archives, but more sophisticated analysis is also possible and can reveal information you were not aware of. At least as of 2012, this kind of analysis was not a routine feature of most email clients.
emailgraph is a command line utility which can take email in the form of a mailbox file or a set of text files and produce social graphs and other types of graphical representation. You can also produce subgraphs based upon particular keywords, so that you can investigate networks of people talking about a particular topic.
You can also restrict the search between dates or between particular times of day. Any dot file viewing program may be used to show the resulting graphs, such as Dotty, XDot or kgraphviewer.
Install from source as follows:
make
sudo make install
If you wish to generate a Debian package see https://github.com/bashrc/emailgraph-debian
To plot additional graphs you will need to have gnuplot installed. For example:
sudo apt-get install gnuplot graphviz xdot
| Short | Verbose               | Description |
|-------+-----------------------+---------------------------------------------------------------------|
| -m    | --mbox <filename>     | Mailbox filename |
| -d    | --dir <directory>     | Directory to search for emails |
| -c    | --contacts            | Print contacts |
| -t    | --top                 | Print the most active contacts |
| -i    | --influence           | Print contacts in order of influence |
|       | --valence             | Print contacts in order of valence |
|       | --sentiment           | Show valence in plots and social graphs |
|       | --sentimentneg        | Show negative valence in plots and social graphs |
|       | --start <dd/mm/yyyy>  | Start date |
|       | --end <dd/mm/yyyy>    | End date |
|       | --starthour <hour>    | Start time in hours (0-24) |
|       | --endhour <hour>      | End time in hours (0-24) |
|       | --dot                 | Print in dot file format |
| -w    | --linewidth <width>   | Maximum line width in dot file |
|       | --nonames             | Don't use names within diagrams |
|       | --monthly             | CSV containing monthly email volumes |
|       | --yearly              | CSV containing yearly email volumes |
|       | --daily               | CSV containing daily average email volume for each hour |
|       | --weekly              | CSV containing weekly average email volume for each hour |
| -s    | --sentences           | Print sentences for subsequent semantic analysis |
|       | --plotdensity <image> | Plot email density to an image file |
|       | --plotdaily <image>   | Plot daily email volumes to an image file |
|       | --plotweekly <image>  | Plot weekly email volumes to an image file |
|       | --plotmonthly <image> | Plot monthly email volumes to an image file |
|       | --plotannual <image>  | Plot annual average email volumes to an image file |
|       | --keywords <words>    | Search for keywords |
|       | --corpus              | Prints all emails which contain matching keywords |
|       | --title <title>       | Title to be added to a plot |
|       | --min <number>        | Minimum number of interactions |
|       | --gender <M/F/MF>     | Select only males (M), females (F) or male/female interactions (MF) |
|       | --merge <filename>    | Specify a merge file containing contacts to be merged |
|       | --exclude <filename>  | Specify an exclude file containing contacts to be removed |
|       | --tests               | Run unit tests |
| -v    | --version             | Show version number |
| -h    | --help                | Show help |
emailgraph -m mboxfile --dot > social-graph.dot
cat mboxfile | emailgraph --dot > social-graph.dot
The -d option recursively examines subdirectories and attempts to read all files within them:
emailgraph -d emaildirectory --dot > social-graph.dot
Omitting names with the --nonames option is recommended if you plan to publish the graphs and want the names of individuals to remain private.
emailgraph -m mboxfile --dot --nonames > social-graph.dot
Use --start and --end to restrict the graph to a date range, with each date given in dd/mm/yyyy format:
emailgraph -m mboxfile --dot --start 20/1/2007 --end 30/5/2009 > social-graph.dot
You can also specify a start date relative to the current date using the options:
Such as the following, to show a social graph for the previous 6 months:
emailgraph -m mboxfile --dot --start 6month > social-graph.dot
emailgraph -m mboxfile --dot --starthour 9 --endhour 17 > social-graph.dot
Keywords are not case sensitive and can also include names or email addresses. Multiple keywords are separated by commas.
emailgraph -m mboxfile --dot --keywords "foo,bar" > social-graph.dot
You can also use the --min option to show only those individuals with more than a given number of interactions. The --corpus option may be used to print all emails which contain matching keywords.
emailgraph -m mboxfile --keywords "foo,bar" --corpus > emails.txt
Dot files may be viewed using other utilities, such as dotty, xdot or kgraphviewer.
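If you would rather have a static image than an interactive viewer, Graphviz (installed above alongside xdot) can render a dot file directly. The following is a minimal sketch; render_graph is a hypothetical helper name and not part of emailgraph.

```shell
# render_graph FILE.dot: write FILE.png next to the input using Graphviz.
# (render_graph is a hypothetical helper name, not part of emailgraph.)
render_graph() {
    dot -Tpng "$1" -o "${1%.dot}.png"
}

# Example: render_graph social-graph.dot
```

Replacing -Tpng with -Tsvg (and the .png suffix with .svg) produces a scalable image, which may be easier to read for large graphs.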
In some cases you may see multiple nodes in the social graph which correspond to the same person. This may be because their name is spelled differently, or because they have more than one email address. To make the graph neater and easier to interpret, you can manually specify the names or email addresses of individuals who should be merged together. All entries are case sensitive.
To do this create a merge file such as the following:
J Smith,John Smith
Clarke K,Clarke Kent
Susan,Susan Jones
In each case the first name is the name as it currently appears in the graph and the second is the name that you wish it to be changed to. Then use the --merge option, for example:
emailgraph -m mboxfile --dot --merge mymerges.txt > social-graph.dot
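The merge file can be created and sanity-checked from the shell before it is passed to --merge. This is only a sketch: check_pairs is a hypothetical helper, and it assumes that every line holds exactly one comma, which would not hold if a name itself contained a comma.

```shell
# Write the example merge file: one "current name,replacement name" pair per line.
cat > mymerges.txt <<'EOF'
J Smith,John Smith
Clarke K,Clarke Kent
Susan,Susan Jones
EOF

# check_pairs FILE: hypothetical helper that reports lines which do not
# contain exactly one comma. Assumes names themselves never contain commas.
check_pairs() {
    awk -F',' 'NF != 2 { printf "line %d: expected name,name: %s\n", NR, $0; bad = 1 }
               END { exit bad }' "$1"
}

check_pairs mymerges.txt
```

The check prints nothing and exits with status 0 when every line looks like a pair, so it can be run before each emailgraph invocation.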
Contacts can also be excluded from the social graph. This works in a similar manner to merging, but uses the --exclude option to specify a file containing names to be removed. Names are case sensitive. For example you could make a file like this:
Kevin Killjoy
Mr Troll
Doctor Strangecode
Then call emailgraph as follows:
emailgraph -m mboxfile --dot --exclude banned.txt > social-graph.dot
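Because names are case sensitive, an entry that differs only in capitalisation will not match the contact you meant to remove. The sketch below lists exclude-file entries that have no exact match in a list of names. unmatched_excludes is a hypothetical helper, and it assumes you already have a plain text file with one name per line; the output format of --contacts is not described here, so it may need reshaping before it can be used this way.

```shell
# unmatched_excludes EXCLUDE_FILE NAMES_FILE
# Hypothetical helper: print the entries of EXCLUDE_FILE which do not appear
# verbatim (case-sensitive, whole line) in NAMES_FILE.
# Assumes NAMES_FILE holds one name per line.
unmatched_excludes() {
    # -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
    # grep exits with 1 when it prints nothing, which is not an error here.
    grep -Fvx -f "$2" "$1" || [ "$?" -eq 1 ]
}

# Example: unmatched_excludes banned.txt names.txt
```

Any name it prints is worth checking for a spelling or capitalisation difference.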
Plotting email volumes requires gnuplot to be installed.
emailgraph -m mboxfile --plotmonthly monthly-emails.png --title "My graph title"
emailgraph -m mboxfile --plotweekly weekly-emails.png
emailgraph -m mboxfile --plotdaily daily-emails.png
emailgraph -m mboxfile --plotdensity email-density.png
You can also combine this with keywords and date or time limits, as required.
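As an illustration of combining filters, the sketch below wraps one set of keyword and date limits in a small function so that several plots can share them. plot_filtered is a hypothetical helper, and the keywords, dates and file names are the placeholder values used elsewhere in this document.

```shell
# plot_filtered PLOT_OPTION IMAGE
# Hypothetical helper: run one emailgraph plot with a fixed set of filters.
# PLOT_OPTION is one of the --plot... options, IMAGE is the output file.
plot_filtered() {
    emailgraph -m mboxfile "$1" "$2" \
        --keywords "foo,bar" --start 20/1/2007 --end 30/5/2009
}

# Examples:
#   plot_filtered --plotmonthly monthly-emails.png
#   plot_filtered --plotweekly weekly-emails.png
```

Keeping the filters in one place makes it less likely that two plots meant for comparison end up covering different date ranges.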
Usage plots and social graphs can also show valence, that is, how positive or negative the emails are.
emailgraph -m mboxfile --plotmonthly monthly.png --sentiment
emailgraph -m mboxfile --plotmonthly monthly.png --sentimentneg
Social graph nodes may have their background shading altered depending upon the average valence values for emails sent by each individual:
emailgraph -m mboxfile --dot --sentiment > social-graph.dot
Here lighter background shades indicate more positive valence, and darker shades indicate lower or more negative valence.