Aside from this blog I maintain a handful of other publications with feeds, and for convenience I have them aggregated in Mozilla Thunderbird v2.0.0.16 alongside all my mail. It's not a completely solid piece of software but it does the job. However, I managed to find one of its foibles just the other day.
On one of my publications the "title" holds important metadata about an entry. In this case the item that the publication is about had just entered a new phase, and although we were already three days into it I decided to put a day count in entry titles from then on. Naturally, I wanted the existing three entries to carry day counts too, for consistency, so I went back and edited them. That was fine.
But then I went into Thunderbird and the old titles were still on the existing entries. Well, of course. I wouldn't expect any RSS aggregator to re-retrieve a post that had already been stored. They deliberately don't do that. So I figured if I just deleted the old entries, they'd be re-downloaded the next time Thunderbird synced with the internet.
No. They weren't.
Poking around in Thunderbird's internals can often be fun and it's surprisingly easy, thanks to the filesystem-based message storage. I have my profile stored in F:\thunderbird, and I soon deduced that the file F:\thunderbird\Mail\News & Blogs\feeditems.rdf was responsible for 'remembering' the UIDs of the most recently downloaded entries, thereby ensuring against duplicates.
So I figured, if I just erased the file's contents, the cache would have to be rebuilt, the last five entries would be re-downloaded, and I could simply delete the duplicates manually this one time. That's exactly what happened, the first time. And the second time, and the third… in fact, the cache no longer seemed to be working at all, even though it was being reconstructed inside feeditems.rdf.
I struggled with this for some time, recreating subscriptions, shuffling folders and even removing the entire post history. Nothing worked. In frustration my finger began to hover over the "delete" key and, as it happens, that file was selected in Directory Opus at the time. feeditems.rdf was gone, and suddenly the "new mail" alert sound stopped ringing in my ears. The duplicates had stopped piling up on each other.
I checked F:\thunderbird\Mail\News & Blogs\ and, indeed, feeditems.rdf had been recreated and reconstructed, and this time it was working properly. To get a clean setup back I purged my feed folder, deleted feeditems.rdf, did a Thunderbird sync and everything was how it should be. My blog entries were back and they were not duplicating.
Conclusion
As it turned out, if feeditems.rdf is empty or otherwise does not contain valid RDF XML, the caching system breaks completely: the file gets refilled on each sync, but nothing in it stops the same entries being downloaded again. As long as you remove the file entirely, though, it will be reconstructed properly on the next sync. I can only imagine that there's some flag inside the application that isn't being set properly if feeditems.rdf doesn't physically need recreating on disk, even if its contents need reconstructing.
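For what it's worth, the "delete it rather than empty it" rule is easy to script. Here is a minimal Python sketch, not anything Thunderbird ships: the function name remove_if_broken is made up, and it only tests whether the file parses as XML at all, which is a weaker check than "valid RDF XML". I'd run it with Thunderbird closed, just to be safe.

```python
import os
import xml.etree.ElementTree as ET


def remove_if_broken(path):
    """Delete the file at `path` if it is empty or not well-formed XML.

    Hypothetical helper: returns True if the file was deleted,
    False if it was missing or parsed cleanly and was left alone.
    """
    if not os.path.exists(path):
        return False  # nothing to delete
    try:
        ET.parse(path)  # raises ParseError on empty or malformed input
    except ET.ParseError:
        os.remove(path)  # remove the file outright instead of emptying it
        return True
    return False


if __name__ == "__main__":
    # Path taken from my own setup; adjust for your profile.
    cache = r"F:\thunderbird\Mail\News & Blogs\feeditems.rdf"
    print("deleted" if remove_if_broken(cache) else "left alone")
```

A file that is well-formed XML but still nonsense as RDF would slip past this check, so it is only a rough guard.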
And also as it turns out, a far better way of re-grabbing a blog entry is to just remove its <RDF:Description/> tag from feeditems.rdf. I'm not really sure why I didn't think of that before.
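Doing that by hand in a text editor is perfectly fine, but it could also be sketched in Python. The big assumptions here: that each entry is a <RDF:Description/> element sitting directly under the root, and that it carries an RDF:about attribute containing something recognisable such as the entry's URL. I haven't confirmed that layout beyond my own file, and forget_entries is a made-up name. The sketch keeps a .bak copy first, because ElementTree may rename any namespace prefixes it doesn't know (to ns0, ns1 and so on) when it writes the file back; that is equivalent XML, but it won't look the same.

```python
import shutil
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Keep the upper-case "RDF" prefix when the file is written back out.
ET.register_namespace("RDF", RDF_NS)


def forget_entries(path, needle):
    """Remove every RDF:Description whose RDF:about contains `needle`.

    Hypothetical helper; returns the number of elements removed.
    Assumes the descriptions are direct children of the root element.
    """
    tree = ET.parse(path)
    root = tree.getroot()
    removed = 0
    for desc in list(root.findall(f"{{{RDF_NS}}}Description")):
        about = desc.get(f"{{{RDF_NS}}}about", "")
        if needle in about:
            root.remove(desc)
            removed += 1
    if removed:
        shutil.copy2(path, path + ".bak")  # keep a backup of the original
        tree.write(path, encoding="utf-8", xml_declaration=True)
    return removed
```

Calling forget_entries on the cache file with a distinctive chunk of the entry's URL should, if those assumptions hold, make Thunderbird treat the entry as new on the next sync.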