beagle & soc07 18 Jul 2007 06:12 pm

beagle Thunderbird extension is out!

As promised yesterday: here it is :-) If you intend to try this extension out, keep reading, as this post contains a lot of important information. How the extension works is explained to the extent needed for testing.

How it’s done (just the very basics)

For starters, this extension works similarly to the Firefox extension. The extension itself is not aware of beagle and does not communicate with the beagle daemon. Instead, it produces small metafiles in a specially selected directory. Beagle monitors this directory, and files created in it will be parsed and indexed by beagle at some point.

The “destination directory”, to which the extension writes its files, is stored inside the beagle Thunderbird index directory and is called ToIndex. Most users will find this directory in ~/.beagle/Indexes/ThunderbirdIndex/ToIndex. When new data is indexed, this is where the metafiles are stored until beagle does its magic.

Keeping track of indexed data

Once data has been indexed, it’s marked as indexed. This makes sure we don’t index the same data set twice and also speeds up the process a lot. Speeding things up is especially important when using an extension, since we can only index data while Thunderbird is running. The marking system allows the extension to pick up where it last left off. Once everything has been indexed a first time, there are only immediate updates, which is nice. Another nice thing is that we can index data when beagle isn’t running.

But how does Thunderbird know that beagle has all of its data? Thunderbird can’t be sure. This can’t be totally ignored, however (and it isn’t). I’ve put a small check into the extension that figures out whether the ToIndex directory exists. If it does, the extension assumes that everything is fine and continues with the indexing process normally. If it doesn’t exist, all data is marked as “not indexed” before indexing, and the directory is created as well. This solves two of the bigger issues: when ~/.beagle or ~/.beagle/Indexes/ThunderbirdIndex is removed. Everything will be re-indexed if you decide to remove either of them.
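
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the extension’s real code: the function names, the item shape ({ uri, indexed }) and the dirExists flag (standing in for a real file system check on ToIndex) are all made up for this example.

```javascript
// Hypothetical sketch; names and data shapes are assumptions, not the extension's code.
// `dirExists` stands in for a real check of whether the ToIndex directory exists.
function prepareIndexing(dirExists, items) {
  if (dirExists) {
    // ToIndex is present: trust the existing marks and continue normally.
    return { createDir: false, items: items };
  }
  // ToIndex is missing: beagle's data was probably removed, so forget all marks
  // and ask the caller to create the directory again.
  const reset = items.map(function (item) {
    return { uri: item.uri, indexed: false };
  });
  return { createDir: true, items: reset };
}

// Items that still need a metafile.
function pendingItems(items) {
  return items.filter(function (item) {
    return !item.indexed;
  });
}

// Usage: one message was already marked, but ToIndex has been removed.
const exampleItems = [
  { uri: 'imap-message://user@server/INBOX#1', indexed: true },
  { uri: 'imap-message://user@server/INBOX#2', indexed: false }
];
const examplePlan = prepareIndexing(false, exampleItems);
console.log(examplePlan.createDir, pendingItems(examplePlan.items).length);
```

Resetting every mark is blunt, but it needs no communication with beagle, which fits the idea that the extension never talks to the daemon.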

Now it’s time to add a note about the meaning of “indexed”. It is very important to understand what it means in this context. When something is indexed in Thunderbird, these small metafiles are produced (as explained above), and beagle will process them when it has time to do so. This could be within a few seconds, but also a few minutes or even hours. The normal case will probably be within a few seconds or minutes once the initial indexing phase is over. The initial indexing phase ends once Thunderbird has created metafiles for all your data and beagle has indexed all of them. So expect that it might take some time for things to end up in beagle until this phase has ended.

Note: There’s no Thunderbird backend in beagle yet, so no data will end up in beagle as of this moment. The above is just theory. Creating this backend is the next step, and I will begin working on it within the next couple of days.

Using the extension

The extension adds a couple of things to the table once installed. It is automatically enabled and will begin to index your data, and in most cases you don’t have to do anything. But there are a few built-in features worth knowing about, mainly privacy features.

In the bottom right corner you’ll see the famous beagle dog (if the installation was successful). It clearly indicates whether the extension is currently enabled or disabled. Just click this icon if you want to enable or disable indexing.

You can right-click any folder and select Never index this folder if you want that folder never to be indexed. No data is removed from beagle when you do this, so you’ll have to manually remove anything already indexed. Just right-click the same folder and select Remove folder from index to do so. Be sure to answer No in the dialog window that pops up; if you answer Yes, the Never index this folder flag will be removed as well. This applies in general when removing content from the index. Options similar to these will be available for individual emails and other items once I figure out how to do this (I’m having some problems with this overlay in particular).

Lastly, there’s a small settings dialog that you can use to change various settings. Just go to Tools->Beagle indexing settings to show it. It has three pages: Indexing, Privacy and Status. Here’s a small explanation of each page:

  • Indexing - You can use this page to enable or disable the indexing process. Just check or uncheck the check box. But more importantly: you can change the indexing speed from here. Just change it to whatever suits your needs. Beta testers should play around here a bit; there is more information at the end of the post.
  • Privacy - In case you want to disable an entire source, e.g. you don’t want to index any POP3 emails, you can do that from here. You’ll also find some potential rescue options here. In case you want to remove everything from beagle’s index, just press the Drop everything button. Note that the backend will immediately begin the indexing process again after you’ve done this, so make sure you disable the indexing process before pressing the button if you don’t want anything to end up in beagle again. The Reset index status button is quite useful if you want to re-index everything without dropping things from beagle’s index.
  • Status - This page displays some information about the indexing process. The number of items added to and/or removed from beagle’s index appears here, as well as how many items are currently queued up. You’ll also see whether the extension is idle or there are more things to index. It’s a great way to see if the initial indexing process has completed (Indexing status should say Idle).

Known issues

There are currently a few known issues that you guys don’t have to report as bugs:

  • The main loop is currently running at all times, looking for data to index. This isn’t expensive in any way, but it makes Thunderbird wake up a lot and adds to power consumption (which matters to laptop users).
  • The pages in the preference dialog might appear mixed up. I don’t know why this is happening because it shouldn’t. They all show up correctly in the CVS version, so I’ll just hope everything works correctly in the next Thunderbird version.
  • There’s currently no way of excluding individual items from the indexing process, as there is for folders. This is of course planned, but I just can’t get the menu items to show up.
  • When removing or unindexing content, a small window will pop up and show the progress (since it can take a couple of seconds with a lot of content). Unfortunately this window isn’t threaded, so it only shows up after everything is done and won’t show any progress (you might see a small window flash by right after installing the extension; that’s this window).
  • Thunderbird currently lacks notifications for when folders are renamed and when messages are removed from the Trash folder. I cannot provide these features as of today, but David Bienvenu over at the Mozilla project is looking into this and he’ll implement it some time (I don’t know when; maybe it’s already implemented?).
  • No about box…

My intention is of course to fix all these issues, but I will give them lower priority until the end of summer. There are more important things to deal with right now (like creating the backend so that data ends up in beagle at all).

Notes to testers (important)

In order to get this extension to work, you’ll have to make sure that the ~/.beagle/Indexes/ThunderbirdIndex directory exists. The extension won’t start if it doesn’t. Either create it with your favourite file manager or type the following command in a terminal (beagle will create this directory itself in the end, so this is just a temporary solution):

mkdir -p ~/.beagle/Indexes/ThunderbirdIndex

The Error console is your friend, and it’s the first place you should check if you suspect something is wrong. You’ll find it in the Tools menu. You can also enable the dump function, which prints some things to the terminal. The easiest way to do that is to open the Error console, paste the following line into the text box and press Enter (this is one line):

Components.classes['@mozilla.org/preferences;1'].getService(Components.interfaces.nsIPref).SetBoolPref('browser.dom.window.dump.enabled', true);

Note that you won’t get any notification about success here. When you want to disable this, do the same thing but change true at the end of the line to false. In order to see the messages you must run Thunderbird from a terminal. Just open a terminal and run thunderbird or mozilla-thunderbird (which one depends on your distribution). If you have downloaded Thunderbird from mozilla.org and run it from a standalone directory, just cd into that directory and type ./thunderbird to get yourself going.

One request to all testers: please try out various indexing speed settings (found in the preference dialog). The values I’ve used are totally arbitrary and not good at all. I can for instance run the unrecommended setting Instant with no apparent CPU usage at all, but it’s not a very fast setting (not as fast as the title claims, at least). You can tinker with the settings by selecting Custom. These are the settings you’ll see when doing so:

  • Batch count - The number of objects to process each time the main loop runs. You can think of it as: “every Batch delay process Batch count items”.
  • Queue count - The number of items that need to be in the queue before it automatically empties itself.
  • Batch delay - The pause between two batches, as described above. Measured in seconds.
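
To get a feeling for how the three values interact, here is a small simulation. It is a sketch under assumptions, not the extension’s main loop: simulateIndexing and flush are hypothetical names, no real timers are used, and “the queue empties itself” is interpreted as “the queued items are handed to a flush callback once Queue count is reached”.

```javascript
// Hypothetical simulation of "every Batch delay process Batch count items".
// Nothing here is taken from the extension's source; it only models the settings.
function simulateIndexing(pending, settings, flush) {
  if (settings.batchCount < 1 || settings.queueCount < 1) {
    throw new Error('batchCount and queueCount must be at least 1');
  }
  const queue = [];
  let ticks = 0;
  let position = 0;
  while (position < pending.length) {
    // One main-loop tick: take at most batchCount items.
    const batch = pending.slice(position, position + settings.batchCount);
    position += batch.length;
    ticks += 1;
    for (const item of batch) {
      queue.push(item);
      // Assumed meaning of Queue count: flush as soon as the queue is full.
      if (queue.length >= settings.queueCount) {
        flush(queue.splice(0, queue.length));
      }
    }
  }
  // Flush whatever is left once there is nothing more to process.
  if (queue.length > 0) {
    flush(queue.splice(0, queue.length));
  }
  // Rough duration: one Batch delay (in seconds) per tick.
  return { ticks: ticks, seconds: ticks * settings.batchDelay };
}

// Usage: 25 items, 10 per tick, flush at 20 queued items, 2 seconds between ticks.
const exampleSizes = [];
const exampleRun = simulateIndexing(
  Array.from({ length: 25 }, function (_, i) { return 'item' + i; }),
  { batchCount: 10, queueCount: 20, batchDelay: 2 },
  function (items) { exampleSizes.push(items.length); }
);
console.log(exampleRun.ticks, exampleRun.seconds, exampleSizes);
```

In this model a larger Batch count or a smaller Batch delay shortens the total time, while Queue count only changes how often queued items are written out.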

Try out various settings and see what happens. You can use the Reset index status button in the Indexing page when you want to restart the entire indexing process (when trying settings out). The Status page will tell you when everything is done.

Be sure to also check out the metafiles created to make sure they contain what they should (they are stored in the ToIndex directory mentioned earlier). An ordinary object looks something like this:

<MailMessage>
<Author>Some author (some.author@some.domain.com)</Author>
<Charset>ISO-8859-1</Charset>
<Date>1165271435</Date>
<Folder>Inbox</Folder>
<FolderURL>imap://user@server/INBOX</FolderURL>
<HasOffline>false</HasOffline>
<MessageId>some-id@some.domain.com</MessageId>
<MessageSize>4085</MessageSize>
<OfflineSize>0</OfflineSize>
<Recipients>Some user (some.user@some.domain.com)</Recipients>
<Subject>Some subject</Subject>
<Uri>imap-message://user@server/INBOX#1</Uri>
</MailMessage>

RSS feed entries look exactly the same, but FeedItem is used instead of MailMessage to tell them apart. When removing a folder, it may look something like this:

<DeleteFolder>
<FolderURL>imap://user@server/INBOX</FolderURL>
</DeleteFolder>
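
If you want to compare generated files against what you expect, a tiny serializer is enough to reproduce the shape of these objects. This is a minimal sketch, not how the extension writes its files: toMetafile and escapeXml are hypothetical helpers, and whether the extension escapes special characters this way is an assumption.

```javascript
// Hypothetical helpers for reproducing the metafile layout shown above.
// Escapes the characters that would otherwise break the XML structure.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

// Builds one object: a root element with one child element per field,
// in the order the fields were given.
function toMetafile(rootName, fields) {
  const lines = ['<' + rootName + '>'];
  for (const name of Object.keys(fields)) {
    lines.push('<' + name + '>' + escapeXml(fields[name]) + '</' + name + '>');
  }
  lines.push('</' + rootName + '>');
  return lines.join('\n');
}

// Usage: the folder removal object from above and a shortened mail object.
console.log(toMetafile('DeleteFolder', { FolderURL: 'imap://user@server/INBOX' }));
console.log(toMetafile('MailMessage', {
  Folder: 'Inbox',
  Subject: 'Some subject',
  Uri: 'imap-message://user@server/INBOX#1'
}));
```

The same helper covers the other object types by passing a different root name and field set.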

Deleting messages looks similar too, but DeleteHdr is used instead of DeleteFolder and Uri is used instead of FolderURL. When everything in a folder is removed, only one file (like the DeleteFolder example above) is created instead of one file for each object in the folder, since this is much more efficient. You can manually remove the ToIndex directory to force a re-index, and you can also remove all files inside the ToIndex directory if you want to (nothing bad will happen if you do so at this time; you should not do this once the backend is implemented, however). It might be a good idea to clean it out every once in a while when trying out smaller things, like moving messages/folders around or removing something, just to see if the correct files are generated.

Download

The XPI that you need in order to try this extension out is available below. The source code is available in beagle’s SVN tree here. The minimum required version of Thunderbird is set to 2.0, so you’ll need that.

Thunderbird extension v0.1

A few words at the end

I’m leaving for a couple of days tomorrow morning and will have limited Internet access during that time. Feel free to comment on this post with bugs, requests or whatever you feel is worth knowing for me and others. I’ll check in when I get the chance to do so.

Good luck and happy testing :-)

Update

Here’s the updated version of the extension. Hopefully it will fix some of the bugs you’ve run into so far:

Thunderbird extension v0.1 update

10 Responses to “beagle Thunderbird extension is out!”

  1. on 19 Jul 2007 at 8:13 pm 1.Andreas said …

    Hi Pierre, as promised, i’m giving it a try. I even installed thunderbird 2.0 on two pcs to test it ;)
    so far i don’t have any cpu or memory hogging at all. but as you said we should test the speed settings: is it possible that they don’t have any effect at all? the only difference is that it says putting 20 in queue instead of 10.
    error console prints a lot of “ns_error_failure” within msgHasOffline and addToIndex Line 254. unfortunately i can’t copy it out of the error console.
    and i have the feeling that indexing doesn’t really work. at work i ended up with over 120k files “objectXXXXXX” and here at home the info shows 8xx files indexed. 8xx in queue and on my drive are only 210 objectXXX files….
    i hope i can take a look at it later today

  2. on 20 Jul 2007 at 11:18 pm 2.Rogue said …

    Hi Pierre,

    I installed your extension and things seem to be running smooth, except for the following exception in the error console, that seems to be repeating itself unendingly:

    Error: [Exception... "Component returned failure code: 0x80550006 [nsIMsgFolder.getMessages]" nsresult: "0x80550006 ()" location: "JS frame :: chrome://beagle/content/beagleMain.js :: anonymous :: line 185" data: no]
    Source File: chrome://beagle/content/beagleMain.js
    Line: 185

    Hope you will be able to patch this easily.

    later

  3. on 21 Jul 2007 at 4:26 pm 3.Wal said …

    Similar results for me, but the extension only appeared to index new mail as it arrived, not the existing emails I have stored. Does the extension cater for non-standard mail folder locations? I have my tbird prefs.js pointing to a different hard drive and directory where I store my emails.

    Cheers, Wal

  4. on 24 Jul 2007 at 7:01 pm 4.Pierre Östlund said …

    I’m going to try to answer all of you here (and concentrate on the issues):

    Andreas:
    The exceptions you are seeing here are quite strange to me. But I’ve added a small try-catch block to catch these and default to other values in case we run into them. My guess is that these exceptions are the reason you get so many object files as well. The files are created and the mails are marked as indexed, but the changes are never committed to the database. Hopefully this is fixed now.

    Rogue:
    The exception you are seeing is usually thrown when trying to access content in a folder that has no “defined” content. This usually comes down to “root” folders, like the “Local Folders” folder. This can easily be fixed by checking if the folder has content before trying to access it, which I’ve done in the latest XPI.

    Wal:
    You have a pretty strange problem here. Does the extension not generate object files for you? The location of your content should not matter; Thunderbird should take care of that by itself (I don’t request content from a specific place, I just say that I want all folders associated with one account and then the underlying code takes care of the rest). If you right-click a folder that isn’t indexed, do you see the “Remove” alternatives? Also, you could try removing the ToIndex directory and see if your content is indexed correctly.

    You should all try the latest XPI, added to the bottom of this post (remove the old extension first, and the ToIndex directory as well). It hopefully fixes some issues, but it also prints a few more lines, for instance which folder is currently being processed and when indexing of a folder is complete, so that we can debug even further.

  5. on 25 Jul 2007 at 2:21 pm 5.Andreas said …

    i just installed the update and it looks _VERY_ promising. i see changes in speed immediately when changing them in the preferences. and the number of object files now stays in line with the real mail count :)

    so big thumbs up from here :)

  6. on 25 Jul 2007 at 7:01 pm 6.Pierre Östlund said …

    That’s great news Andreas :-)

    I’ve done some other modifications now as well that hopefully will fix some other bugs. But I’ll blog about that later and include a new XPI for that.

  7. on 30 Jul 2007 at 6:54 pm 7.Dinesh said …

    Hi Pierre,

    I installed the 0.1 version today morning and followed the instructions. I get the following error message:
    Error: [Exception... "Component returned failure code: 0x80550006 [nsIMsgFolder.getMessages]" nsresult: "0x80550006 ()" location: "JS frame :: chrome://beagle/content/beagleMain.js :: anonymous :: line 185" data: no]
    Source File: chrome://beagle/content/beagleMain.js
    Line: 185

    I don’t see any “Remove” alternatives on any folder so far. Also ~/.beagle/Indexes/ThunderbirdIndex/ToIndex is empty.

    Thanks for any help,

    Dinesh

  8. on 03 Aug 2007 at 2:28 pm 8.Id2ndR said …

    I tried it on Ubuntu Feisty.
    Beagle index status was inactive just after restarting Thunderbird (2.0). When I activated it, it just went back to inactive a few seconds later.
    I created the directory ~/.beagle/Indexes/ThunderbirdIndex and then I could activate it.

    After doing this, beagle shows me that 86 folders remain to be indexed but doesn’t seem to index them. It just indexes new e-mail as it arrives, as others said before.

  9. on 03 Aug 2007 at 2:44 pm 9.Pierre Östlund said …

    Hi Id2ndR,

    The ~/.beagle/Indexes/ThunderbirdIndex directory must exist for the extension to work, yes. Beagle creates this directory if the Thunderbird backend is available, so this will be taken care of once you have beagle installed with the Thunderbird backend.

    Did you use the extension from this post? If you did, uninstall it and try the latest extension instead (available in my latest blog post). It contains a lot of bug fixes and should also be able to tell you about errors. Let me know your status here once you have (or if you are already using that version of the extension, so that I know where to start).

  10. on 19 Aug 2007 at 4:46 pm 10.Id2ndR said …

    Hi Pierre,

    I used the Thunderbird extension from this post. I don’t have the TB backend, so that’s the trouble. I used the beagle version provided in Ubuntu’s repositories, which isn’t the latest one.

    I’ll search for another repository that may contain another packaged version of beagle rather than compiling the trunk version.

    Thanks for your response.
