beagle & soc07 03 Jul 2007 11:16 am
Rethinking the whole idea (I’m a bit split)
I’m starting to question whether the approach I’m currently taking is really worth all the fuss. We will always have problems parsing Mork files, and the backend itself is going to be incredibly complex. I also got a response from David Bienvenu yesterday, and he suggests writing an extension instead of taking the path I’m currently walking. This is the way both Spotlight for Mac OS X (which David has been involved with) and Google Desktop do it. It seems very rational and would make things a lot easier. We would lose the capability to index when Thunderbird isn’t running, but if I decide to implement it this way we would actually be able to “index” things when beagle isn’t running (the design would be similar to the IndexingService backend).
I know that some people are reading this by now and I would like your input on the matter. Should I go on as I intended from the beginning, or take the alternative road and write an extension (which would require a less complex backend) instead?
Also, David is putting some effort into adding support for loading individual mails in Thunderbird from the command line, which is one of the things I’m going to need in the future. We’ll see how this evolves, but it seems promising.
on 03 Jul 2007 at 11:36 am 1.Andreas said …
As a former user of beagle to index thunderbird mails, i would say: go for it!
What bugged me the most was the memory footprint and cpu usage of beagle caused by indexing thunderbird mails.
i think most of us are running thunderbird all the time, and i see no problem in having to run thunderbird to get the latest mails into the beagle index.
even if someone has a problem with this, i think people would have an even bigger problem with beagle consuming a lot of cpu/memory because of the thunderbird backend.
btw maybe i’m missing the point, but why is it so hard to parse mork? if thunderbird saves the mails in this format, the code should be in the thunderbird source, so it shouldn’t be a problem at all?! but as i said, i think i’m missing the point here
on 03 Jul 2007 at 11:45 am 2.Pierre Östlund said …
Thanks for your reply, Andreas! I understand exactly what you are saying here. The memory footprint bugged me as hell when I wrote the first backend. Using an extension would give me more control over the indexing process and would also put the memory footprint on Thunderbird instead of beagle, which would be a better approach.
The thing about Mork is that its implementation is sooo complex. The _source code_ itself is over one megabyte in size and is all written in C++, making it pretty much impossible to P/Invoke from C# since there are no static symbols to invoke. So a new parser is pretty much the only alternative here. That’s why Mork is hell.
on 03 Jul 2007 at 12:20 pm 3.Andreas said …
thanks for your explanation regarding mork. 1 MB of source for mork sounds really huge
so i understand your approach.
so imho, go for the thunderbird extension
on 07 Jul 2007 at 3:25 pm 4.Fredrik said …
As jwz wrote about Mork a couple years ago: “Mork, which is — and I do not use these words lightly — the single most braindamaged file format that I have ever seen in my nineteen year career”.
If we look at general trends for the Mozilla tree as a whole, Mork is probably going the way of the Dodo sooner rather than later. With Places, Firefox killed off Mork use completely. SeaMonkey is doing the same thing with the move from XPFE to Toolkit. Everyone’s trying to use mozStorage instead, which is basically an XPCOM layer on top of SQLite.
It wouldn’t be surprising if Thunderbird did the same thing eventually.
on 07 Jul 2007 at 4:07 pm 5.Aron van Ammers said …
I support the idea of using an extension as well. Your code would be simpler and more straightforward, which is good. For me it wouldn’t be a problem to have Thunderbird open most of the time.
And another thought: in the long term it’s not unthinkable that someone could write some kind of daemon executable for TB which could call TB functionality (e.g. extensions) without TB running. At least not unthinkable to me, not hindered by too much knowledge about the architecture of TB
on 07 Jul 2007 at 10:20 pm 6.Pierre Östlund said …
I’ve heard that phrase somewhere too, Fredrik, and I can pretty much agree with it. The concept of Mork is good (not duplicating data and all that), but the way it is done is not entirely optimal. It is, for instance, a lot easier to read data stored in binary form than data stored as text, and Mork does the latter. SQLite would probably be the way I’d go here. My bet is that the Thunderbird developers will switch to it some day too.
I’m not sure that writing a “daemon”-like application so that we can perform indexing while Thunderbird isn’t running is such a great idea. We would, for instance, get data sharing problems if the user decides to start Thunderbird while this daemon is running. Shutting down the daemon and starting it again when Thunderbird is shut down would be the most feasible way to work around that (not very optimal IMHO). We would also get a lot of extra code to maintain and another process lying around eating memory. I don’t know how much memory the core needs when it’s stripped down, but it won’t be a freebie. Still, the idea is good, but we won’t see this with Thunderbird. IIRC Evolution does something like this, using a “data server” to share contacts (maybe other things too?) between applications.