Archive for October, 2005
Beagle sure has come a long way in terms of maturity over the last few months.
I’ve been getting involved with Beagle’s interaction with dotLucene, which is the C# port of Apache Lucene – a very powerful text search architecture. Beagle stores the text content of indexed files within Lucene ‘databases’ and uses Lucene’s impressive search features to query on behalf of the user.
We previously used dotLucene 1.4.3 within Beagle, but I recently upgraded us to 1.9 RC1. Beagle is mostly unaffected by the changes, but there are some bug fixes and optimizations included. Perhaps the biggest win was the result of my extensive testing to make sure the upgrade didn’t break anything – I did identify and fix two bugs, and they were both also present in the 1.4 code.
The first bug was a file descriptor leak in a common code path (inside Beagle code). The other was a fairly significant locking bug which often caused the locking to have no effect at all. This explains some of the strange behaviour that has cropped up from time to time in the past and which we’ve never been able to pinpoint.
I also looked at some traces through the code paths. I noticed that dotLucene was throwing and catching a hell of a lot of exceptions – hundreds of them while indexing a small range of files. dotLucene was using exception catching where simple if/else combinations would work just fine. Exception handling is expensive, as the runtime must jump through hoops keeping track of where to jump to if a certain type of exception occurs, so by greatly reducing the amount of exception handling that takes place, we have a nice small optimization in place.
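To make the exceptions-versus-if/else point concrete, here is a minimal sketch. It is written in Python rather than C#, it is not dotLucene’s code, and the function names are made up for illustration. Both functions count how many keys exist in a table: the first uses an exception for the miss case, the second a plain check. The relative cost differs between runtimes, so the timing at the bottom is only there to try it out locally.

```python
import timeit


def count_hits_with_exceptions(table, keys):
    """Count keys present in table, treating a miss as an exception."""
    hits = 0
    for key in keys:
        try:
            table[key]
            hits += 1
        except KeyError:
            # A miss raises; the runtime has to build and unwind an exception.
            pass
    return hits


def count_hits_with_checks(table, keys):
    """Same result, but a miss is handled by an ordinary if test."""
    hits = 0
    for key in keys:
        if key in table:
            hits += 1
    return hits


if __name__ == "__main__":
    # Hypothetical data: most lookups miss, so the exception path is hot.
    table = {i: True for i in range(100)}
    keys = list(range(10_000))
    for fn in (count_hits_with_exceptions, count_hits_with_checks):
        seconds = timeit.timeit(lambda: fn(table, keys), number=20)
        print(f"{fn.__name__}: {seconds:.3f}s")
```

The two functions always return the same count; the only difference is how the common “not found” case is handled, which is the kind of change described above.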
After landing dotLucene 1.9, I’ve now turned some attention to another aspect of Beagle’s data storage mechanism. Beagle uses SQLite to store file attributes when extended attributes are not available, and for its file text cache.
Currently, Beagle only uses SQLite 2.x. Attempting to ‘port’ it to SQLite 3 revealed a problem in our SQLite interaction. You must always query a SQLite database from the same thread on which the connection was originally established. Beagle is multi-threaded and we are using the same connection over multiple threads, which is (apparently) unsafe, and SQLite 3 explicitly checks this and returns an error if you go beyond the original thread.
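A same-thread rule of this kind is easy to see in miniature. The sketch below is an illustration only: it uses Python’s standard sqlite3 module, not Beagle’s C# bindings, and the helper name is hypothetical. That module applies its own same-thread check by default (the `check_same_thread` argument to `sqlite3.connect`), so a connection opened on one thread is refused when it is used from another.

```python
import sqlite3
import threading


def query_from_other_thread(conn):
    """Run a trivial query on conn from a new thread.

    Returns the error message if the module refuses the call, else None.
    """
    outcome = {}

    def worker():
        try:
            conn.execute("SELECT 1").fetchone()
            outcome["error"] = None
        except sqlite3.ProgrammingError as exc:
            # Raised when the connection is used outside its creating thread.
            outcome["error"] = str(exc)

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return outcome["error"]


if __name__ == "__main__":
    # Default behaviour: the connection is tied to the thread that opened it.
    conn = sqlite3.connect(":memory:")
    print("cross-thread error:", query_from_other_thread(conn))
    conn.close()
```

Passing `check_same_thread=False` to `sqlite3.connect` turns the Python-level check off, at which point keeping concurrent access safe is the caller’s job.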
This creates a non-trivial problem to solve, and is a poor design decision from the SQLite developers. We’re going to stick with SQLite 2.x only, as it seems to work just fine despite sharing the connection over our thread pool. SQLite 3 wouldn’t bring any major benefits to us, and we are unable to use it due to its new explicit thread checking restriction. Sigh.
Someone out there doesn’t like me…
Back at Easter-time when everyone else received loan renewal forms, mine did not arrive. I attempted to apply online instead, which didn’t work out as I’d never received my login details. I phoned up again and was told I would receive my login details through the post within a week.
Two weeks later, no login details arrived, so I printed off the paper version, filled it in, checked very carefully, and sent it in a week or so after the deadline. No problem, I was assured, “you’ll still get your loan on time”.
Fast forward a month or so, I re-request my login details so that I can check the progress of my loan online. The login details arrive, I log in, my application has been accepted, job done.
Now that term has started, I realise that my loan has not materialised. After phoning them, I receive an array of reasons why this is the case, including that I didn’t even apply for it, I didn’t sign the loan request form, I made a mistake on the application form but they can’t tell me what the mistake is, … If I did make a mistake, how nice of them to tell me. And why does it say application accepted on the website?
I’ve sent in another renewal form, and depending on who answers the phone, I will either get my loan within a few weeks from now, or I won’t get it at all for this term. Useless.