First results of optimization
July 17, 2008
In this post I will describe impact of new more efficient data structure on calendar part of EDS. The new data structure is our famous interval tree. Imagine, that we are maintainers of calendar data and enemy is asking us expensive questions about our data, mostly “give me events in some time range”. In the past, we took naive approach to answer questions. Now, we are doing more wisely.
On the attached picture is shown, how many events need to be scanned to evaluate query. Red color are queries BEFORE optimization – either 1271 or 346 items were scanned (I have two calendars with 1271 and 346 items and before optimization, all events are scanned to answer question). The green line shows, how many items need to be scanned after optimization.
One can notice, that there is green (optimized) query on the beginning that was doing scan through all events. This is query, in which no subexpression of from *occur-in-range* was
found – it was not question of form “give me all events in some time range”. Therefore, interval tree did not help us.
More results:
- More numbers – how many took to evaluate query ?
- How many items was tested (sum of 423 queries) ? For every tested event we had to generate recurrences.



That’s quite impressive! Keep up the awesome work.
Neat!
Nice. Maybe I won’t be switching to GMail after all
Could you give some more detail about the first query?
looks good! wanna have a beer at prague in august?
That’s awesome. Simply awesome. Can’t wait for it.
Thank you for your interest!
I will write details about the queries in next post, hopefully today – I was not sure, how to assign time-data to query, so I would appreciate all comments…
These comments are very motivating, thank you
Andre, if you will be here at Prague in August, we should meet and drink great czech beer, of course
But beware, czech beer is fantastic – you just can’t stop drinking it
)
let’s stay in touch – i’ll write you an email, see you…
great work!! To introduce myself, I am the current calendar maintainer. Am just curious to know where the implemention for the interval tree resides. The File backend maintains all the events as a flat ical file, where as there is also a ECalBackendCache (xml based) which other backends use. The xml based one is also a sequential cache storing all the events as strings which is bad. It would be good have it optimizied and used commonly in all backends to provide better abstraction.