[Users] memory leak in presence module?

Klaus Darilion klaus.mailinglists at pernau.at
Thu May 3 12:34:20 CEST 2007


Hi Daniel!

Summary:
- Without t_release() (no modifications to the source code), openser leaks
  memory.
- With t_release(), openser does not leak. But after some time there is
  strange behaviour, e.g.:
  -: openser stops reacting for some minutes and afterwards gets
     terminated with signal 9. When openser stops working, the load
     increases to > 40. This has happened 3 times now.
  -: openser stops reacting for some minutes and the Linux PC
     where openser is running becomes unresponsive. No login. Open
     SSH sessions are unresponsive. I had to reboot the PC. Happened
     1 time.
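
For reference, the "with t_release()" case means a PUBLISH block in the
routing script roughly like the one below. This is only a sketch: the
route structure and the xlog texts are assumptions modelled on the log
lines quoted further down, not a copy of the running config.

```
# Sketch only: surrounding route logic and log texts are assumed.
if (method == "PUBLISH") {
    xlog("L_INFO", "$ci PUBLISH detected, handle_publish ... outside t_newtran\n");
    if (t_newtran()) {
        xlog("L_INFO", "$ci PUBLISH detected, handle_publish ... inside t_newtran\n");
        # presence module: process the PUBLISH request
        handle_publish();
        # tm module: release the transaction explicitly (the workaround
        # under test); dropping this line gives the leaking variant
        t_release();
    }
}
```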

Maybe this is not purely openser-related, but a problem with openser and
Linux (as I had to reboot the server one time).

Any hints on how to debug this?

regards
klaus

Daniel-Constantin Mierla wrote:
> Hello Klaus,
> 
> I will try to find some ways to investigate the signal 9. As said,
> except while waiting for the mem log to be written, openser has no
> case in which it issues signal 9.
> 
> Regarding the tm stuff, I am not sure whether your last answer, saying
> the mem leak is solved, refers to t_release() only or whether you tried
> the second option as well (removing line 224 in tm/uac.c). Can you give
> a short summary?
> 
> Cheers,
> Daniel
> 
> 
> On 04/30/07 16:34, Klaus Darilion wrote:
>> Hi!
>>
>> I tried again and it happened again:
>>
>> Apr 30 15:00:54 ds3000 /usr/sbin/openser[7648]: 
>> 32b24f15e52d603ba890a9729723c4b0.0167///45-6782 at 83.136.32.132 PUBLISH 
>> detected, handle_publish ...
>> outside t_newtran
>> Apr 30 15:00:54 ds3000 /usr/sbin/openser[7655]: 
>> 32b24f15e52d603ba890a9729723c4b0.7e11///14-6782 at 83.136.32.132 PUBLISH 
>> detected, handle_publish ...
>> outside t_newtran
>> Apr 30 15:00:54 ds3000 /usr/sbin/openser[7648]: 
>> 32b24f15e52d603ba890a9729723c4b0.0167///45-6782 at 83.136.32.132 PUBLISH 
>> detected, handle_publish ...
>> inside t_newtran
>> Apr 30 15:00:54 ds3000 /usr/sbin/openser[7655]: 
>> 32b24f15e52d603ba890a9729723c4b0.7e11///14-6782 at 83.136.32.132 PUBLISH 
>> detected, handle_publish ...
>> inside t_newtran
>> Apr 30 15:01:03 ds3000 /usr/sbin/openser[7644]: child process 7648 
>> exited by a signal 9
>> Apr 30 15:01:08 ds3000 /usr/sbin/openser[7644]: core was not generated
>> Apr 30 15:01:08 ds3000 /usr/sbin/openser[7644]: INFO: terminating due 
>> to SIGCHLD
>> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7657]: INFO: signal 15 received
>> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7657]: Memory status (pkg):
>> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7657]: qm_status (0x8145960):
>> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7657]:  heap size= 1048576
>> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7659]: INFO: signal 15 received
>> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7653]: INFO: signal 15 received
>> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7653]: Memory status (pkg):
>> Apr 30 15:01:14 ds3000 /usr/sbin/openser[7650]: INFO: signal 15 received
>>
>>
>> Any hints on how to debug this?
>>
>> regards
>> klaus
>>
>> Daniel-Constantin Mierla wrote:
>>> Hello Klaus,
>>>
>>> On 04/30/07 13:55, Klaus Darilion wrote:
>>>>
>>>>
>>>> Daniel-Constantin Mierla wrote:
>>>>> Hello Klaus,
>>>>>
>>>>> On 04/27/07 09:27, Klaus Darilion wrote:
>>>>>> Hi Daniel!
>>>>>>
>>>>>> I've tried with t_release and it ran fine for several hours
>>>>>> without leaking. But then, unfortunately, openser terminated with
>>>>>> signal 9. I've never seen this before.
>>>>>
>>>>> signal 9 is KILL, it is very strange if it was not issued by a user 
>>>>> or other process.
>>>>>
>>>>> We discovered the issue (tm/uac.c, line 224): there is a flag that
>>>>> is kept to see if there was some operation with the transaction,
>>>>> but in the case of handle_publish() that flag is set by the TM API
>>>>> when sending NOTIFY. The patch is trivial, removing a line, but we
>>>>> have to investigate if there are some other effects -- so it may
>>>>> take a while. t_release() should solve it in the meantime.
>>>>
>>>> Should solve the memory-leak - but the signal 9?
>>> It might be that it took so long to write the mem log at shutdown
>>> that openser killed itself. If it was not due to an openser stop, then
>>> I am not aware of any case when signal 9 is sent unless issued by a user.
>>>
>>> Cheers,
>>> Daniel
>>>
>>>>
>>>> regards
>>>> klaus
>>>>
>>



