These operations:

  1. Opening Evolution
  2. Browsing through the calendar (going back about 30 months)
  3. Closing Evolution

generated more than 180 live queries:

Browsing cache analysis

Cache size

and caused that heap of EDS process grew by more 2,5 MB on my testing calendar:

Heap

Heap grew by more than 2.5 MB

Notes:

  • Going back by month in Evolution generated 5 live queries (for day, week, work week, month and list view)
  • One can notice, that not all queries were freed, when I closed Evolution. However, I verified, that only live queries generated by evolution-alarm-notify process were not freed. This process stays alive even when Evolution is closed.
  • The first increment of heap size  (from 506K to 8180K) is caused by opening Evolution (opening calendar).
  • The next smaller increments of heap size are due to live queries.
  • The last decrement is caused by closing Evolution.

EDS live queries leak

August 6, 2008

I wanted to take closer look on the memory consumption of  “browsing-through-the-calendar” operation. As I have probably found a bug, I will post about this later.

The bug: when one clicks on the clock applet, live query, that watches the actual month is generated (and calendar window appears). Another click on the clock applet  (to hide the calendar window) – live query IS NOT destroyed (ECalViewListener object is destroyed, but EDataCalView object is not destroyed).  Next click on the calendar window  –  the same live query is generated again (new calendar client was created).

Here is the picture documenting the error : As I was clicking on the clock applet (almost every second), the number of live queries was increasing and they were not destroyed.

To reproduce the bug:

  1. export ORBIT2_DEBUG_STRMAX=1000
  2. export ORBIT2_DEBUG=traces
  3. Click on the clock applet a few times. Watch, that the same live queries are generated.
  4. Open Evolution and create some event in actual month.
  5. Watch CORBA communication – every live queries will be informed, that new object was created by notifyObjectsAdded method

As it seems to me, we are destroying live queries only when orbit connection is disconnected. However, there are so many listeners and views, that I am little bit confused, still.

In my last post, I found out that my parsed calendar.ics file consumes more than 7MB in memory. I decided to look on eds with massif. As I have only this one calendar in Evolution, I expected to get similar results. Here is the massif graph I got:

EDS under massif

EDS under massif

It’s about 13.8 MB. Looking closer on massif logs, I found out, that almost all allocations were done in method  e_cal_backend_file_open. Next investigations :-) led to the finding, that files $HOME/.evolution/calendar/local/system/tasks.ics and $HOME/.evolution/memos/local/system/journal.ics should be blamed for that.

Now, the most interesting thing is the content of those files. Both files contain only time zones. A lot of copies of the same time zone. Is that a feature or a bug ? :-)


slusny@turret:~/.evolution$ grep BEGIN:VTIMEZONE ~/.evolution/tasks/local/system/tasks.ics | wc -l
538
slusny@turret:~/.evolution$ grep "TZID:/softwarestudio.org/Olson_20011030_5/Africa/Algiers" ~/.evolution/tasks/local/system/tasks.ics | wc -l
318
slusny@turret:~/.evolution$ grep BEGIN:VTIMEZONE /home/slusny/.evolution/memos/local/system/journal.ics | wc -l
814

Here is the massif graph, when I removed these two files:

EDS under massif with removed journal.ics and tasks.ics

EDS under massif with removed journal.ics and tasks.ics

Note 1.8.2008: Files journal.ics and tasks.ics are growing only when eds is restarted (either it was killed or segfaulted…).

To reproduce the bug, it is enough to re-run the evolution and activate the calendar view.

How much memory is  consumed by the calendar ? I hope, that this post helps to answer the question.

I will use my favourite big calendar.ics file. Some basic facts about calendar:


slusny@turret:~/gsoc/massif logs$ ls -lh calendar.ics
-rw-r--r-- 1 slusny slusny 813K 2008-07-28 14:15 calendar.ics

slusny@turret:~/gsoc/massif logs$ grep VEVENT calendar.ics | wc -l
2550

Saying this, I consider the calendar as a big one. I prepared simple program, that just opens the calendar and parses the ICS file, similarly to file backend. Valgrind’s massif gave me this graph


MB
7.192^                                                              ,,:. .
|                                                        . . @ @#:: :.
|                                                    . @@: : @ @#:: ::
|                                                  @@: @@: : @ @#:: ::
|                                               ,, @@: @@: : @ @#:: ::@
|                                            . @@@ @@: @@: : @ @#:: ::@
|                                         .. : @@@ @@: @@: : @ @#:: ::@
|                                      .@ :: : @@@ @@: @@: : @ @#:: ::@
|                                    ,@:@ :: : @@@ @@: @@: : @ @#:: ::@
|                                 ,@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@:
|                              ,: @@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@:
|                           , :@: @@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@:
|                         @@@ :@: @@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@:
|                    ., @ @@@ :@: @@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@:
|                 . ,:@ @ @@@ :@: @@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@:.
|               .@: @:@ @ @@@ :@: @@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@::
|            , @:@: @:@ @ @@@ :@: @@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@::
|         ,:@@ @:@: @:@ @ @@@ :@: @@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@::
|     , @ @:@@ @:@: @:@ @ @@@ :@: @@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@::
|    @@ @ @:@@ @:@: @:@ @ @@@ :@: @@ @@:@ :: : @@@ @@: @@: : @ @#:: ::@::
0 +----------------------------------------------------------------------->Mi
0                                                                   287.6

So, the calendar consumes more thatn 7 MB. I ran the program again without valgrind and looked at process
with smem.pl script. This script parses /proc/$pid/smaps into something more useful

VMSIZE: 18136 kb
RSS: 10784 kb total
2864 kb shared
168 kb private clean
7752 kb private dirty

PRIVATE MAPPINGS
vmsize rss clean rss dirty file
7148 kb 0 kb 7148 kb [heap]
40 kb 0 kb 40 kb /usr/local/lib/libORBit-2.so.0.1.0
44 kb 0 kb 40 kb /home/slusny/gsoc/evolution-data-server/calendar/libecal/.lib
88 kb 0 kb 36 kb [stack]

The heap is shown as smaller here (who knows why ?). However, the calendar is quite big, I think. Perhaps iwe should not hold all events (event prehistoric ones) and all event fields in memory all the time :-)

In this post I will describe impact of new more efficient data structure on calendar part of EDS. The new data structure is our famous interval tree. Imagine, that we are maintainers of calendar data and enemy is asking us expensive questions about our data, mostly “give me events in some time range”. In the past, we took naive approach to answer questions. Now, we are doing more wisely.

On the attached picture is shown, how many events need to be scanned to evaluate query. Red color are queries BEFORE optimization – either 1271 or 346 items were scanned (I have two calendars with 1271 and 346 items and before optimization, all events are scanned to answer question). The green line shows, how many items need to be scanned after optimization.

Number of events scanned before and after optimization

Number of events scanned before and after optimization

One can notice, that there is green (optimized) query on the beginning that was doing scan through all events. This is query, in which no subexpression of from *occur-in-range* was
found – it was not question of form “give me all events in some time range”. Therefore, interval tree did not help us.

More results:

  • More numbers – how many took to evaluate query ?

    Comparison of performance - time to evaluate query

    Comparison of performance - time to evaluate query

  • How many items was tested (sum of 423 queries) ? For every tested event we had to generate recurrences.
Number of scanned events

Number of scanned events

Testing interval trees

July 17, 2008

After implementing interval trees, I have started to think about testing them. It would be great to have some test script. To do this, I would need to think about different scenarios, that can happen – different ways of inserting and removing data can lead to different rotations, for example. Of course, I did not want to spend too much time on testing.

Therefore I have written small test program, that inserts random events into interval tree and list, does several searches, deletes some events and  does some searches again. I have found several bugs this way.

But how can I be sure, that all interesting situations were tested ? I have used great gcov and lcov tools. Almost 90% of code was covered by tests, and although the remaining code is not so important, I would like to add more tests in the future. The valgrind did not show me any memory leaks. This is my tip for your  better sleep: test your code.

Interval trees in EDS

July 17, 2008

Interval tree

Interval tree

Finally, I have finished implementation of inteval trees. Less or more, I have followed the book Introduction to Algorithms, 2nd edition. My implementation is based on red-black trees,  too. EDS is often asked for events in some time range (interval). In times before interval trees, we had to go through all events, generate recurrences and test, if the event is in this range. With interval tree, we can do it much faster.

The data in the tree (events in our case) are accessed by interval queries. Underlaying data structure is a red black tree with the low endpoint chosen as a key. To do efficient interval search, each node needs to maintain some additional data – maximum (max) and minimum  (min) value of any interval endpoint stored in the subtree rooted at x. It is needed to update min and max fields when performing rotations, inserting and deleting new data.

Basic operations, like insert, delete and search, are simple, as we already know about red-black trees. They can be done in logarithmic time. The interesting operation is “search all”, which should return all events in given interval. To search for events, we start in the root of the tree and check if we should go left or right – the min and max fields of the left and right son help us to make correct decision. Sometimes it is needed to go in both directions. Therefore, time can be O( min(n, k lg n) ), where k is the number of events in output list and n is the total number of events.

If someone is interested in implementation, here are some differences between my implementation and book:

  • The code in the book is keeping only min field, not max (max field help us to make more effective decisions)
  • We need to store different events with the same interval and search by UID of the event
  • We need to work with open-ended intervals (special value -1)

My gsoc task is to speed up calendar part of e-d-s. These days I have been using  evo calendar extensively. My aim was to identify the most serious bottlenecks by sysprof.

Here are the results of my 5 minute session. The tests were carried out on big calendars, which I received from Federico.

Sysprof analysis of evolution data server

Several things should be noticed:

  • Almost all system time is divided between Evo and Firefox
  • About 30% of system time was spent in Evolution, 15% in e-d-s.
  • I was using only file backend (e-d-s time was spent in e_cal_backend_file_start_query - 14%)

Analysis:

  • Almost all e-d-s time was spent in e_cal_backend_start_query – 14%. This method is called by e-d-s clients (Evolution, clocks…) and does complete sweep  through the calendar. Clients want to get back calendar objects filtered by some expression (sexp).
  • For every calendar object, test against the expression is carried out (e_cal_backend_sexp_match_comp - 11%),  generating all recurrences.  Usually, the evaluation of expression leads to test, if event occurs in some time range (func_occur_in_time_range - 6%). The second very often used function seems to be  e_cal_backend_sexp_func_make_time – 3.9%. Both  these functions sooner or later call function icaltime_as_timet_with_zone, in which is spent 3.60% and 3.13% – summing up to about 6.7% of system time. In fact, these functions are usually called together, because the typical test is “occur-in-time-range? (make-time \”ISO_DATE_STR\”) (make-time \”ISO_DATE_STR\”)”. The expression  make-time “ISO_DATE_STR” leads to calling e_cal_backend_sexp_func_make_time, which converts date from iso string to time_t value. Expression occur-in-time-range leads to calling func_occur_in_time_range.
  • When constructing expression “occur-in-time-range? (make-time \”ISO_DATE_STR\”) (make-time \”ISO_DATE_STR\”)”, time_t value is converted to the string. Later, when evaluating expression, string is converted to time_t (by make-time).

Time spent in converting time formatsTime spent in converting time formats

Conclusions:

  • In e-d-s we spent about 15% of system time, and almost 7% was spent in icaltime_as_timet_with_zone. The function is called to find out, if calendar event occurs in some time range.  There is a lot of time conversion from time_t to string and back.  The rest was spent by sweeping through the calendar objects and generating recurrences.

The story

Evolution data server is process living (unnoticed if you are lucky :-) ) in your Desktop.  Besides other things, it takes care of calendar data. Clients (programs, that need calendar data – Evolution, clock applet…) send questions (called queries) to e-d-s. Typical query looks like “give me all events in some time range”.

At some point during Evolution’s development, developers informally determined that calendar queries to e-d-s were slow.  Nobody took the trouble of actually figuring out why they were slow, so someone added a simple cache instead. Now it’s time to do more optimization.

The task

The task is to find out, if the cache is useful ( taking into account, that it consumes some memory). That is, we have to answer these questions:

  1. Which queries get performed, and how often ?
  2. What is the lifetime of the cached queries?
  3. How long do queries take to be performed?
  4. Is the cache useful ?

Our approach

Together with my great mentor Federico, I have added some logging subsystem to to e-d-s. It logs each calendar query. I have implemented Python script, that can analyze this log and sort queries to various criteria and produce some gnuplot graphs. Following graphs were generated by the script in one of my typical workdays.

1. Which queries get performed and how often ?

First graph shows how  the cache size is changing (green line) during the day. The cache size is the number of different queries stored in cache. Query gets into cache, when some client wants e-d-s to execute (answer) it, and the query is not found in the cache yet. The peaks correspond to massive usage of Evolution (I was browsing through the calendar). After 13:00 I opened the Evolution and closed  it  before 14:00. The next two peaks correspond again to opening/closing Evolution. The red bar shows, how many queries e-d-s answered without executing it – results were already cached. Obviously, this number can be greater than cache size, because some queries are re-executed several times.

Ananalysis of cache

Ok, so we would like to know how many times were individual queries used. Are the queries used uniformly or small part of queries are used again and again ?

There were almost 370 different queries, from which 180 was not reused  while sitting in cache. The mostly used query was reused 9 times.

2. What is the lifetime of the cached queries ?

The ambition of the following graph is to gain insight into queries lifetime. As can be seen from the picture, there were about 320 different queries. Six queries lived more than 10000 secs, majority of them lived about 4000 secs and more than 100 queries lived just for several seconds. As can be noticed from the picture (if it is not too small), about 30 queries (at the rear of the graph) have negative lifetime values – these queries were not destroyed at all. The queries coming from alarm daemon fall into this category.

Liffetime of queries

3. How long do queries take to be performed ?

This is great opportunity to present another graph.

As can be seen from the graph, it depends on backend that we are using. In case of file backend (first two columns) the execution of query is really fast. In the last two columns there are values for google calendar backend. It takes 0.01 secs to execute the query in average and 0.1 sec in worst case.

4. Is the cache useful ?

So what do you think ?  In my opinion the cache is useful, mainly when we expect to work with remote calendars. Perhaps we should ask, if it is worth the memory, that it consumes ( I’m not trying to say that it consumes too much memory, I do not know, but some developers have this opinion ).

Hello friends,

so I’ve been accepted onto the Google Summer of Code 2008 program to work on optimizing calendar part of EDS. The main goal of this blog is to discuss my work (and life) with you. So, feel free to leave any comments.

Interesting thoughts about EDS, that directly apply to my work can be found at blogs of Ross, Philip and Federico.

You can see my commits at my github repository.

My first task is to analyze performance of calendar queries cache. These days I am preparing patches to log every calendar query. If you use calendar with Evolution, I would appreciate, if you could provide me e-d-s logs to analyze. Please, contact me, if you are interested.

As I live in Prague, I am looking forward to meeting at least some of you at Ubuntu Developer Summit next week.

Follow

Get every new post delivered to your Inbox.