Discussion:
[PJSIP 1.1] deadlock on high call volume
Nguyen Van Minh Danh
2009-06-14 15:02:41 UTC
Permalink
I am using PJSIP 1.1 to write an IVR application. The application would dial
a list of outgoing number, play a pre-recorded wav file and ask the user to
press a DTMF to confirm. The application works fine for small number of
concurrent calls (<5), but with a high number of concurrent calls (10 and
above), there is a deadlock after a certain number of calls (the exact
number could vary but is around 500 on average).

In all deadlock situations, the behaviour causing the deadlock is always the
same. Strangely, repeating the behaviour on its own does not replicate the
problem. The steps leading to the deadlock is as follows:

1. The program dials a number, the phone rings and is picked up by the
remote user. The application's on_call_state fires and the call is in the
'CONFIRMED' state, pjsua_call_info.state = PJSIP_INV_STATE_CONFIRMED.
2. My program begins to transmit the WAV file to the above call which was
picked up.
3. Shortly after the WAV playback is done. another on_call_state event takes
places. The state of the call is PJSIP_INV_STATE_EARLY (Session Progress),
which indicates that the remote phone is ringing.
4. A deadlock happens at this stage, and shortly afterwards, any PJSIP
command will fail with "Timed-out trying to acquire PJSUA mutex".

Some analysis using Visual Studio reveals that the deadlock is caused by the
following 3 threads

*****Thread 1 (conference.c)******

PJ_DEF(pj_status_t) pjmedia_conf_remove_port( pjmedia_conf *conf,
unsigned port )
{
.........
pj_mutex_lock(conf->mutex);
........
}

****Calling stack for Thread 1****

pj_mutex_lock(pj_mutex_t * mutex=0x04a692ac) Line 911 + 0xc bytes C
pjmedia_conf_remove_port(pjmedia_conf * conf=0x033229fc, unsigned int
port=3) Line 1069 + 0xc bytes C
pjsua_conf_remove_port(int id=3) Line 1631 + 0x10 bytes C
stop_media_session(int call_id=72) Line 1230 + 0xc bytes C
pjsua_media_channel_deinit(int call_id=72) Line 1271 + 0x9 bytes C
pjsua_call_on_state_changed(pjsip_inv_session * inv=0x05cd3e28,
pjsip_event * e=0x05a1058c) Line 2970 + 0xb bytes C
inv_set_state(pjsip_inv_session * inv=0x05cd3e28, pjsip_inv_state
state=PJSIP_INV_STATE_DISCONNECTED, pjsip_event * e=0x05a1058c) Line 219 +
0x10 bytes C
handle_uac_call_rejection(pjsip_inv_session * inv=0x05cd3e28, pjsip_event
* e=0x05a1058c) Line 3020 + 0xf bytes C
inv_on_state_early(pjsip_inv_session * inv=0x05cd3e28, pjsip_event *
e=0x05a1058c) Line 3256 + 0xd bytes C
mod_inv_on_tsx_state(pjsip_transaction * tsx=0x05e270cc, pjsip_event *
e=0x05a1058c) Line 535 + 0x19 bytes C
pjsip_dlg_on_tsx_state(pjsip_dialog * dlg=0x05cd377c, pjsip_transaction *
tsx=0x05e270cc, pjsip_event * e=0x05a1058c) Line 1907 + 0x1c bytes C
mod_ua_on_tsx_state(pjsip_transaction * tsx=0x05e270cc, pjsip_event *
e=0x05a1058c) Line 178 + 0x11 bytes C
tsx_set_state(pjsip_transaction * tsx=0x05e270cc, pjsip_tsx_state_e
state=PJSIP_TSX_STATE_COMPLETED, pjsip_event_id_e
event_src_type=PJSIP_EVENT_RX_MSG, void * event_src=0x04a6c3ec) Line 1095 +
0x15 bytes C
tsx_on_state_proceeding_uac(pjsip_transaction * tsx=0x05e270cc,
pjsip_event * event=0x05a107f8) Line 2642 + 0x14 bytes C
pjsip_tsx_recv_msg(pjsip_transaction * tsx=0x05e270cc, pjsip_rx_data *
rdata=0x04a6c3ec) Line 1570 + 0x12 bytes C
mod_tsx_layer_on_rx_response(pjsip_rx_data * rdata=0x04a6c3ec) Line 817 +
0xd bytes C
endpt_on_rx_msg(pjsip_endpoint * endpt=0x03327f1c, int status=0,
pjsip_rx_data * rdata=0x04a6c3ec) Line 927 + 0x11 bytes C
pjsip_tpmgr_receive_packet(pjsip_tpmgr * mgr=0x048dc2a8, pjsip_rx_data *
rdata=0x04a6c3ec) Line 1457 + 0x17 bytes C
udp_on_read_complete(pj_ioqueue_key_t * key=0x047aed88,
pj_ioqueue_op_key_t * op_key=0x04a6c3f8, long bytes_read=482) Line 166 +
0x16 bytes C
ioqueue_dispatch_read_event(pj_ioqueue_t * ioqueue=0x047a007c,
pj_ioqueue_key_t * h=0x047aed88) Line 552 + 0x16 bytes C
pj_ioqueue_poll(pj_ioqueue_t * ioqueue=0x047a007c, const pj_time_val *
timeout=0x05a1fccc) Line 764 + 0x17 bytes C
pjsip_endpt_handle_events2(pjsip_endpoint * endpt=0x03327f1c, const
pj_time_val * max_timeout=0x05a1fdc8, unsigned int * p_count=0x05a1fdd8)
Line 719 + 0x10 bytes C
pjsua_handle_events(unsigned int msec_timeout=10) Line 1217 + 0x14 bytes
C
worker_thread(void * arg=0x00000000) Line 519 + 0x7 bytes C
thread_main(void * param=0x04a68aa4) Line 414 + 0x11 bytes C

*****Thread 2 (stream.c)******
static pj_status_t get_frame( pjmedia_port *port, pjmedia_frame *frame)
{
.......
/* Lock jitter buffer mutex first */
pj_mutex_lock( stream->jb_mutex );
.......
}

****Calling stack for Thread 2****

pj_mutex_lock(pj_mutex_t * mutex=0x05ccf534) Line 911 + 0xc bytes C
get_frame(pjmedia_port * port=0x05ccefec, pjmedia_frame *
frame=0x05b1f96c) Line 222 + 0xf bytes C
pjmedia_port_get_frame(pjmedia_port * port=0x05ccefec, pjmedia_frame *
frame=0x05b1f96c) Line 68 + 0x12 bytes C
read_port(pjmedia_conf * conf=0x033229fc, conf_port * cport=0x05cc9f80,
short * frame=0x04a6acf8, unsigned int count=320, pjmedia_frame_type *
type=0x05b1fb20) Line 1363 + 0x10 bytes C
get_frame(pjmedia_port * this_port=0x04a5becc, pjmedia_frame *
frame=0x05b1fd80) Line 1771 + 0x22 bytes C
pjmedia_port_get_frame(pjmedia_port * port=0x04a5becc, pjmedia_frame *
frame=0x05b1fd80) Line 68 + 0x12 bytes C
clock_callback(const pj_timestamp * ts=0x04a6afdc, void *
user_data=0x04a6acdc) Line 181 + 0x10 bytes C
clock_thread(void * arg=0x04a6afc4) Line 255 + 0x18 bytes C
thread_main(void * param=0x04a6b00c) Line 414 + 0x11 bytes C

---------

It seems that thread 1 is waiting on a lock which is locked by thread 2
(conf->mutex->OwningThread points to thread 2's ID) and thread 2 is waiting
on a lock which is locked by thread 3 (stream->jb_mutex->OwningThread points
to thread 3's ID). Thread 3 details is as follows:

*****Thread 3 (jbuf.c)******

jb_framelist_put_at(jb_framelist_t * framelist=0x05ccf574, unsigned int
index=15447332, const void * frame=0x05baccb4, unsigned int frame_size=80,
unsigned int bit_info=3435973836) Line 241 + 0x1c bytes C
pjmedia_jbuf_put_frame2(pjmedia_jbuf * jb=0x05ccf56c, const void *
frame=0x05baccb4, unsigned int frame_size=80, unsigned int
bit_info=3435973836, int frame_seq=15447332, int * discarded=0x05910b2c)
Line 546 + 0x1c bytes C
on_rx_rtp(void * data=0x05ccefe4, void * pkt=0x05bacca8, long
bytes_read=172) Line 1404 + 0x47 bytes C
srtp_rtp_cb(void * user_data=0x05cd2704, void * pkt=0x05bacca8, long
size=172) Line 809 + 0x1f bytes C
on_rx_rtp(pj_ioqueue_key_t * key=0x048f8674, pj_ioqueue_op_key_t *
op_key=0x05bab268, long bytes_read=172) Line 471 + 0x17 bytes C
ioqueue_dispatch_read_event(pj_ioqueue_t * ioqueue=0x048df834,
pj_ioqueue_key_t * h=0x048f8674) Line 552 + 0x16 bytes C
pj_ioqueue_poll(pj_ioqueue_t * ioqueue=0x048df834, const pj_time_val *
timeout=0x0591feac) Line 764 + 0x17 bytes C
worker_proc(void * arg=0x048dd91c) Line 269 + 0x13 bytes C
thread_main(void * param=0x04a4d940) Line 414 + 0x11 bytes C

Unlike thread 1 and 2 which are frozen waiting for the locks, thread 3 is
still running and keeps on printing the following message:

19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame
19:06:42.156 Master/sound 257 samples reduced, buf_cnt=656
19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame
19:06:42.156 Master/sound 146 samples reduced, buf_cnt=670
19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame
19:06:42.156 Master/sound 185 samples reduced, buf_cnt=645
19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame
19:06:42.156 Master/sound 164 samples reduced, buf_cnt=641
19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame
..........

Thread 3 never stops printing the message and never releases the lock
(stream->jb_mutex). Therefore, a deadlock occurs between thread 1, 2 and
thread 3.

>From what I understand, thread 1 attempts to add a new conference port for
the call which is in 'EARLY' stage. Thread 2 is responsible for maintaining
the audio stream between the conference port. Thread 3 is responsible for
sending/receiving socket data. I could not figured out the exact location
where threads 3 acquire the lock, thus causing the deadlock later on.

Any ideas on how I can fix this problem? Is this a bug with PJSIP, or has it
got anything to do with my code?

Any advice is appreciated. Thanks in advance.
David Clark
2009-06-15 21:30:47 UTC
Permalink
Yea I think I might have been there done that on this issue. Question:
when you do a play/record/conference add/remove do you do it from a
pjsip callback function?
If so I think that is happening is if the conference bring is
processing a frame while the callback happens
the conf mutex is locked which causes the pjsua_ function to lock the
pjsua mutex and then block on the
conf mutex.

Solution: in the callback record data, like ANI, DNIS, conf_slot,
etc. But do the pjsua_ calls in a seperate
thread.

David Clark



At 10:02 AM 6/14/2009, Nguyen Van Minh Danh wrote:
>I am using PJSIP 1.1 to write an IVR application. The application
>would dial a list of outgoing number, play a pre-recorded wav file
>and ask the user to press a DTMF to confirm. The application works
>fine for small number of concurrent calls (<5), but with a high
>number of concurrent calls (10 and above), there is a deadlock after
>a certain number of calls (the exact number could vary but is around
>500 on average).
>
>In all deadlock situations, the behaviour causing the deadlock is
>always the same. Strangely, repeating the behaviour on its own does
>not replicate the problem. The steps leading to the deadlock is as follows:
>
>1. The program dials a number, the phone rings and is picked up by
>the remote user. The application's on_call_state fires and the call
>is in the 'CONFIRMED' state, pjsua_call_info.state = PJSIP_INV_STATE_CONFIRMED.
>2. My program begins to transmit the WAV file to the above call
>which was picked up.
>3. Shortly after the WAV playback is done. another on_call_state
>event takes places. The state of the call is PJSIP_INV_STATE_EARLY
>(Session Progress), which indicates that the remote phone is ringing.
>4. A deadlock happens at this stage, and shortly afterwards, any
>PJSIP command will fail with "Timed-out trying to acquire PJSUA mutex".
>
>Some analysis using Visual Studio reveals that the deadlock is
>caused by the following 3 threads
>
>*****Thread 1 (conference.c)******
>
>PJ_DEF(pj_status_t) pjmedia_conf_remove_port( pjmedia_conf *conf,
> unsigned port )
>{
>.........
> pj_mutex_lock(conf->mutex);
>........
>}
>
>****Calling stack for Thread 1****
>
> pj_mutex_lock(pj_mutex_t * mutex=0x04a692ac) Line 911 + 0xc bytes C
> pjmedia_conf_remove_port(pjmedia_conf * conf=0x033229fc, unsigned
> int port=3) Line 1069 + 0xc bytes C
> pjsua_conf_remove_port(int id=3) Line 1631 + 0x10 bytes C
> stop_media_session(int call_id=72) Line 1230 + 0xc bytes C
> pjsua_media_channel_deinit(int call_id=72) Line 1271 + 0x9 bytes C
> pjsua_call_on_state_changed(pjsip_inv_session * inv=0x05cd3e28,
> pjsip_event * e=0x05a1058c) Line 2970 + 0xb bytes C
> inv_set_state(pjsip_inv_session * inv=0x05cd3e28, pjsip_inv_state
> state=PJSIP_INV_STATE_DISCONNECTED, pjsip_event *
> e=0x05a1058c) Line 219 + 0x10 bytes C
> handle_uac_call_rejection(pjsip_inv_session * inv=0x05cd3e28,
> pjsip_event * e=0x05a1058c) Line 3020 + 0xf bytes C
> inv_on_state_early(pjsip_inv_session * inv=0x05cd3e28,
> pjsip_event * e=0x05a1058c) Line 3256 + 0xd bytes C
> mod_inv_on_tsx_state(pjsip_transaction * tsx=0x05e270cc,
> pjsip_event * e=0x05a1058c) Line 535 + 0x19 bytes C
> pjsip_dlg_on_tsx_state(pjsip_dialog * dlg=0x05cd377c,
> pjsip_transaction * tsx=0x05e270cc, pjsip_event *
> e=0x05a1058c) Line 1907 + 0x1c bytes C
> mod_ua_on_tsx_state(pjsip_transaction * tsx=0x05e270cc,
> pjsip_event * e=0x05a1058c) Line 178 + 0x11 bytes C
> tsx_set_state(pjsip_transaction * tsx=0x05e270cc,
> pjsip_tsx_state_e state=PJSIP_TSX_STATE_COMPLETED, pjsip_event_id_e
> event_src_type=PJSIP_EVENT_RX_MSG, void *
> event_src=0x04a6c3ec) Line 1095 + 0x15 bytes C
> tsx_on_state_proceeding_uac(pjsip_transaction * tsx=0x05e270cc,
> pjsip_event * event=0x05a107f8) Line 2642 + 0x14 bytes C
> pjsip_tsx_recv_msg(pjsip_transaction * tsx=0x05e270cc,
> pjsip_rx_data * rdata=0x04a6c3ec) Line 1570 + 0x12 bytes C
> mod_tsx_layer_on_rx_response(pjsip_rx_data *
> rdata=0x04a6c3ec) Line 817 + 0xd bytes C
> endpt_on_rx_msg(pjsip_endpoint * endpt=0x03327f1c, int status=0,
> pjsip_rx_data * rdata=0x04a6c3ec) Line 927 + 0x11 bytes C
> pjsip_tpmgr_receive_packet(pjsip_tpmgr * mgr=0x048dc2a8,
> pjsip_rx_data * rdata=0x04a6c3ec) Line 1457 + 0x17 bytes C
> udp_on_read_complete(pj_ioqueue_key_t * key=0x047aed88,
> pj_ioqueue_op_key_t * op_key=0x04a6c3f8, long bytes_read=482) Line
> 166 + 0x16 bytes C
> ioqueue_dispatch_read_event(pj_ioqueue_t * ioqueue=0x047a007c,
> pj_ioqueue_key_t * h=0x047aed88) Line 552 + 0x16 bytes C
> pj_ioqueue_poll(pj_ioqueue_t * ioqueue=0x047a007c, const
> pj_time_val * timeout=0x05a1fccc) Line 764 + 0x17 bytes C
> pjsip_endpt_handle_events2(pjsip_endpoint * endpt=0x03327f1c,
> const pj_time_val * max_timeout=0x05a1fdc8, unsigned int *
> p_count=0x05a1fdd8) Line 719 + 0x10 bytes C
> pjsua_handle_events(unsigned int msec_timeout=10) Line 1217 + 0x14 bytes C
> worker_thread(void * arg=0x00000000) Line 519 + 0x7 bytes C
> thread_main(void * param=0x04a68aa4) Line 414 + 0x11 bytes C
>
>*****Thread 2 (stream.c)******
>static pj_status_t get_frame( pjmedia_port *port, pjmedia_frame *frame)
>{
>.......
>/* Lock jitter buffer mutex first */
>pj_mutex_lock( stream->jb_mutex );
>.......
>}
>
>****Calling stack for Thread 2****
>
> pj_mutex_lock(pj_mutex_t * mutex=0x05ccf534) Line 911 + 0xc bytes C
> get_frame(pjmedia_port * port=0x05ccefec, pjmedia_frame *
> frame=0x05b1f96c) Line 222 + 0xf bytes C
> pjmedia_port_get_frame(pjmedia_port * port=0x05ccefec,
> pjmedia_frame * frame=0x05b1f96c) Line 68 + 0x12 bytes C
> read_port(pjmedia_conf * conf=0x033229fc, conf_port *
> cport=0x05cc9f80, short * frame=0x04a6acf8, unsigned int count=320,
> pjmedia_frame_type * type=0x05b1fb20) Line 1363 + 0x10 bytes C
> get_frame(pjmedia_port * this_port=0x04a5becc, pjmedia_frame *
> frame=0x05b1fd80) Line 1771 + 0x22 bytes C
> pjmedia_port_get_frame(pjmedia_port * port=0x04a5becc,
> pjmedia_frame * frame=0x05b1fd80) Line 68 + 0x12 bytes C
> clock_callback(const pj_timestamp * ts=0x04a6afdc, void *
> user_data=0x04a6acdc) Line 181 + 0x10 bytes C
> clock_thread(void * arg=0x04a6afc4) Line 255 + 0x18 bytes C
> thread_main(void * param=0x04a6b00c) Line 414 + 0x11 bytes C
>
>---------
>
>It seems that thread 1 is waiting on a lock which is locked by
>thread 2 (conf->mutex->OwningThread points to thread 2's ID) and
>thread 2 is waiting on a lock which is locked by thread 3
>(stream->jb_mutex->OwningThread points to thread 3's ID). Thread 3
>details is as follows:
>
>*****Thread 3 (jbuf.c)******
>
> jb_framelist_put_at(jb_framelist_t * framelist=0x05ccf574,
> unsigned int index=15447332, const void * frame=0x05baccb4,
> unsigned int frame_size=80, unsigned int bit_info=3435973836) Line
> 241 + 0x1c bytes C
> pjmedia_jbuf_put_frame2(pjmedia_jbuf * jb=0x05ccf56c, const void
> * frame=0x05baccb4, unsigned int frame_size=80, unsigned int
> bit_info=3435973836, int frame_seq=15447332, int *
> discarded=0x05910b2c) Line 546 + 0x1c bytes C
> on_rx_rtp(void * data=0x05ccefe4, void * pkt=0x05bacca8, long
> bytes_read=172) Line 1404 + 0x47 bytes C
> srtp_rtp_cb(void * user_data=0x05cd2704, void * pkt=0x05bacca8,
> long size=172) Line 809 + 0x1f bytes C
> on_rx_rtp(pj_ioqueue_key_t * key=0x048f8674, pj_ioqueue_op_key_t
> * op_key=0x05bab268, long bytes_read=172) Line 471 + 0x17 bytes C
> ioqueue_dispatch_read_event(pj_ioqueue_t * ioqueue=0x048df834,
> pj_ioqueue_key_t * h=0x048f8674) Line 552 + 0x16 bytes C
> pj_ioqueue_poll(pj_ioqueue_t * ioqueue=0x048df834, const
> pj_time_val * timeout=0x0591feac) Line 764 + 0x17 bytes C
> worker_proc(void * arg=0x048dd91c) Line 269 + 0x13 bytes C
> thread_main(void * param=0x04a4d940) Line 414 + 0x11 bytes C
>Unlike thread 1 and 2 which are frozen waiting for the locks, thread
>3 is still running and keeps on printing the following message:
>
> 19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame
> 19:06:42.156 Master/sound 257 samples reduced, buf_cnt=656
> 19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame
> 19:06:42.156 Master/sound 146 samples reduced, buf_cnt=670
> 19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame
> 19:06:42.156 Master/sound 185 samples reduced, buf_cnt=645
> 19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame
> 19:06:42.156 Master/sound 164 samples reduced, buf_cnt=641
> 19:06:42.156 ec041832F0 Underflow, buf_cnt=0, will generate 1 frame
>..........
>
>Thread 3 never stops printing the message and never releases the
>lock (stream->jb_mutex). Therefore, a deadlock occurs between thread
>1, 2 and thread 3.
>
> From what I understand, thread 1 attempts to add a new conference
> port for the call which is in 'EARLY' stage. Thread 2 is
> responsible for maintaining the audio stream between the conference
> port. Thread 3 is responsible for sending/receiving socket data. I
> could not figured out the exact location where threads 3 acquire
> the lock, thus causing the deadlock later on.
>
>Any ideas on how I can fix this problem? Is this a bug with PJSIP,
>or has it got anything to do with my code?
>
>Any advice is appreciated. Thanks in advance.
>
>
>_______________________________________________
>Visit our blog: http://blog.pjsip.org
>
>pjsip mailing list
>***@lists.pjsip.org
>http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org
Nguyen Van Minh Danh
2009-06-17 04:35:05 UTC
Permalink
Hi David

Thanks for the suggestions. I was doing the playing/recording/conference
add/remove port inside the on_call_state callback event. After some
research, I think the issue I mentioned has got to do with this

[pjsip] Infinite loop in jbuf causing deadlock,
http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/2009-March/006761.htm
l

There was a bug in jbuf.c (v1.1) which resulted in an infinite loop. The
PJSIP release log shows that the issue has been fixed in PJSIP 1.2. So I
upgraded to 1.2 (I was using 1.1), did not change my code at all, and the
issue doesn't seem to appear anymore.

Nevertheless, you have a valid point that a deadlock may happen when calling
pjsua_ calls from a PJSIP callback event. Previously I observed deadlocks
when calling pjsua_call_makecall and pjsua_call_hangup, which could only be
resolved by calling these APIs from a separate thread but I did not know it
had to do with the callback event.

As an experiment, I run PJSIP 1.1 with code modified to do the pjsua_ calls
from a separate thread (and not in the PJSIP callback function) and the
issue also does not appear again.

Thanks
logan
2009-06-17 06:06:39 UTC
Permalink
>Nevertheless, you have a valid point that a deadlock may happen when calling pjsua_ calls from a PJSIP callback event.

On Windows the mutex semantics are recursive in nature, so a thread that acquired a mutex can re-acquire it multiple times without entering a deadlock (assuming here that the PJSIP callback is from the same thread that holds the lock). In general I haven't had issues in calling pjsua_* API's from the callbacks on both Windows and Symbian.

Cheers,
Hitesh
----- Original Message -----
From: Nguyen Van Minh Danh
To: ***@lists.pjsip.org
Sent: Tuesday, June 16, 2009 9:35 PM
Subject: Re: [pjsip] [PJSIP 1.1] deadlock on high call volume


Hi David

Thanks for the suggestions. I was doing the playing/recording/conference add/remove port inside the on_call_state callback event. After some research, I think the issue I mentioned has got to do with this

[pjsip] Infinite loop in jbuf causing deadlock, http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/2009-March/006761.html

There was a bug in jbuf.c (v1.1) which resulted in an infinite loop. The PJSIP release log shows that the issue has been fixed in PJSIP 1.2. So I upgraded to 1.2 (I was using 1.1), did not change my code at all, and the issue doesn't seem to appear anymore.

Nevertheless, you have a valid point that a deadlock may happen when calling pjsua_ calls from a PJSIP callback event. Previously I observed deadlocks when calling pjsua_call_makecall and pjsua_call_hangup, which could only be resolved by calling these APIs from a separate thread but I did not know it had to do with the callback event.

As an experiment, I run PJSIP 1.1 with code modified to do the pjsua_ calls from a separate thread (and not in the PJSIP callback function) and the issue also does not appear again.

Thanks


------------------------------------------------------------------------------


_______________________________________________
Visit our blog: http://blog.pjsip.org

pjsip mailing list
***@lists.pjsip.org
http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org
Manoj Joshi
2009-06-17 07:12:31 UTC
Permalink
Hello Guys,

I am facing a wierd issue related to registration with one of the SIP
Registrar. With this Registrar, Pjsip keeps on sending registration every
second. I have read the previous issues related to malformed contact header
and tried to change some values but it never stops sending registration.
Here is the SIP messages i captured at server.

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrntA
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0

SIP/2.0 100 Trying
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrn
tA
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Server: DIGITALK Multiservice Platform rev.239
Content-Length: 0


SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrn
tA
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>;tag=ba08b629fs
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="6f1fff64c9a9619e2a4587b705b3fe6a5ae84778",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrntA
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0

SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrn
tA
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>;tag=ba08b629fs
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="6f1fff64c9a9619e2a4587b705b3fe6a5ae84778",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjt0IEQVfsVEAQVZrmqeAiKQlQv4ekffWg
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38213 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Authorization: Digest username="500100467", realm="sip.xyz.com",
nonce="6f1fff64c9a9619e2a4587b705b3fe6a5ae84778", uri="sip:sip.xyz.com",
response="ef7c42c065e7ec56ccaa5f6e9c74b913", algorithm=MD5
Content-Length: 0


SIP/2.0 200 OK
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjt0IEQVfsVEAQVZrmqeAiKQlQv4ekff
Wg
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>;tag=bbe22c93ds
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38213 REGISTER
Contact: <sip:***@114.31.177.96:5060>;expires=3383;q=1.0
Server: DIGITALK Multiservice Platform rev.239
Expires: 3383
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQFsW
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0

SIP/2.0 100 Trying
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQF
sW
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Server: DIGITALK Multiservice Platform rev.239
Content-Length: 0

SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQF
sW
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>;tag=b81dec22es
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="b09fd8072b58a2e3fef032d18e2431b61b4cff0b",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQFsW
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0


SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQF
sW
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>;tag=b81dec22es
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="b09fd8072b58a2e3fef032d18e2431b61b4cff0b",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjT1O4N33LqqhGYp-HG9xlkEO5B-3hhPQ6
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38215 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Authorization: Digest username="500100467", realm="sip.xyz.com",
nonce="b09fd8072b58a2e3fef032d18e2431b61b4cff0b", uri="sip:sip.xyz.com",
response="a77b5fb5f65036faee7358f1aa38db3f", algorithm=MD5
Content-Length: 0

SIP/2.0 200 OK
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjT1O4N33LqqhGYp-HG9xlkEO5B-3hhP
Q6
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>;tag=b17feaffas
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38215 REGISTER
Contact: <sip:***@114.31.177.96:5060>;expires=3576;q=1.0
Server: DIGITALK Multiservice Platform rev.239
Expires: 3576
Content-Length: 0

-----------------------------

--------------------------------

..and so on.....

Regards,

Manoj
Manoj Joshi
2009-06-18 06:05:17 UTC
Permalink
Hello Benny,

I saw that someone else faced this issue in past and you asked for SIP
trace. After that there was no message exchanged. I have sent the SIP trace
can you please comment?

Regards,

Manoj
-----Original Message-----
From: pjsip-***@lists.pjsip.org
[mailto:pjsip-***@lists.pjsip.org]On Behalf Of Manoj Joshi
Sent: Wednesday, June 17, 2009 12:43 PM
To: pjsip list
Subject: [pjsip] PJSIP Symbian_uaGUI sends registration every second


Hello Guys,

I am facing a wierd issue related to registration with one of the SIP
Registrar. With this Registrar, Pjsip keeps on sending registration every
second. I have read the previous issues related to malformed contact header
and tried to change some values but it never stops sending registration.
Here is the SIP messages i captured at server.

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrntA
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0

SIP/2.0 100 Trying
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrn
tA
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Server: DIGITALK Multiservice Platform rev.239
Content-Length: 0


SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrn
tA
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>;tag=ba08b629fs
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="6f1fff64c9a9619e2a4587b705b3fe6a5ae84778",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrntA
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0

SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrn
tA
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>;tag=ba08b629fs
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="6f1fff64c9a9619e2a4587b705b3fe6a5ae84778",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjt0IEQVfsVEAQVZrmqeAiKQlQv4ekffWg
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38213 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Authorization: Digest username="500100467", realm="sip.xyz.com",
nonce="6f1fff64c9a9619e2a4587b705b3fe6a5ae84778", uri="sip:sip.xyz.com",
response="ef7c42c065e7ec56ccaa5f6e9c74b913", algorithm=MD5
Content-Length: 0


SIP/2.0 200 OK
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjt0IEQVfsVEAQVZrmqeAiKQlQv4ekff
Wg
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>;tag=bbe22c93ds
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38213 REGISTER
Contact: <sip:***@114.31.177.96:5060>;expires=3383;q=1.0
Server: DIGITALK Multiservice Platform rev.239
Expires: 3383
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQFsW
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0

SIP/2.0 100 Trying
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQF
sW
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Server: DIGITALK Multiservice Platform rev.239
Content-Length: 0

SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQF
sW
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>;tag=b81dec22es
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="b09fd8072b58a2e3fef032d18e2431b61b4cff0b",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQFsW
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0


SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQF
sW
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>;tag=b81dec22es
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="b09fd8072b58a2e3fef032d18e2431b61b4cff0b",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjT1O4N33LqqhGYp-HG9xlkEO5B-3hhPQ6
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38215 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Authorization: Digest username="500100467", realm="sip.xyz.com",
nonce="b09fd8072b58a2e3fef032d18e2431b61b4cff0b", uri="sip:sip.xyz.com",
response="a77b5fb5f65036faee7358f1aa38db3f", algorithm=MD5
Content-Length: 0

SIP/2.0 200 OK
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjT1O4N33LqqhGYp-HG9xlkEO5B-3hhP
Q6
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>;tag=b17feaffas
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38215 REGISTER
Contact: <sip:***@114.31.177.96:5060>;expires=3576;q=1.0
Server: DIGITALK Multiservice Platform rev.239
Expires: 3576
Content-Length: 0

-----------------------------

--------------------------------

..and so on.....

Regards,

Manoj
David Clark
2009-06-19 18:48:06 UTC
Permalink
Yea even with 1.2 I still have pjsua_call_hangup() and
pjsua_call_get_info() returning 70009 after like a 3 minute delay.
I am back to a mutex debug enabled hammer test. Anybody have any
ideas? The pjsua_call_hangup() and pjsua_call_get_info()
are not in a callback function. The pjsua_call_get_info() is after a
make_call and is optional. I am only doing it to get the call_id string
the sip provider sends for tracking. Or in this case pjsip makes up.

David Clark


At 11:35 PM 6/16/2009, Nguyen Van Minh Danh wrote:
>Hi David
>
>Thanks for the suggestions. I was doing the
>playing/recording/conference add/remove port inside the
>on_call_state callback event. After some research, I think the issue
>I mentioned has got to do with this
>
>[pjsip] Infinite loop in jbuf causing deadlock,
><http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/2009-March/006761.html>http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/2009-March/006761.html
>
>There was a bug in jbuf.c (v1.1) which resulted in an infinite loop.
>The PJSIP release log shows that the issue has been fixed in PJSIP
>1.2. So I upgraded to 1.2 (I was using 1.1), did not change my code
>at all, and the issue doesn't seem to appear anymore.
>
>Nevertheless, you have a valid point that a deadlock may happen when
>calling pjsua_ calls from a PJSIP callback event. Previously I
>observed deadlocks when calling pjsua_call_makecall and
>pjsua_call_hangup, which could only be resolved by calling these
>APIs from a separate thread but I did not know it had to do with the
>callback event.
>
>As an experiment, I run PJSIP 1.1 with code modified to do the
>pjsua_ calls from a separate thread (and not in the PJSIP callback
>function) and the issue also does not appear again.
>
>Thanks
>_______________________________________________
>Visit our blog: http://blog.pjsip.org
>
>pjsip mailing list
>***@lists.pjsip.org
>http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org
Joel Dodson
2009-06-19 19:48:05 UTC
Permalink
Hi David,

when you say you're not in a call back function, you mean the pjsua_*
calls you are making are not done from a PJSIP callback function? Or
you mean the deadlock is not in a callback function?

I'm not using PJSUA. I'm using the lower level APIs, but I ran into a
deadlock issue that would also be an issue for PJSUA. The problem was
interaction between my locking mechanism and PJSIPs. I had locks on
resources that I acquire when working on the resource. When the
resource would call some PJSIP functions, an internal lock was
acquired (one I wasn't aware of at the time). That same internal lock
is acquired in some cases before my callbacks are called.

I had a case where one thread had my application layer resource lock
and called a PJSIP function which acquired the dialog lock. On
another thread (a PJSIP thread) the dialog lock had already been
acquired and that thread was calling one of my callbacks. My callback
was trying to acquire my application layer lock. Boom. Deadlock.

Now I always acquire the PJSIP dialog lock before trying to acquire my
application layer resource lock.

Hope this helps.

Joel

On Fri, Jun 19, 2009 at 11:48 AM, David Clark<***@tx.rr.com> wrote:
> Yea even with 1.2 I still have pjsua_call_hangup() and pjsua_call_get_info()
> returning 70009 after like a 3 minute delay.
> I am back to a mutex debug enabled hammer test.  Anybody have any ideas?
> The pjsua_call_hangup() and pjsua_call_get_info()
> are not in a callback function.  The pjsua_call_get_info() is after a
> make_call and is optional.  I am only doing it to get the call_id string
> the sip provider sends for tracking.  Or in this case pjsip makes up.
>
> David Clark
>
>
> At 11:35 PM 6/16/2009, Nguyen Van Minh Danh wrote:
>
> Hi David
>
> Thanks for the suggestions. I was doing the playing/recording/conference
> add/remove port inside the on_call_state callback event. After some
> research, I think the issue I mentioned has got to do with this
>
> [pjsip] Infinite loop in jbuf causing deadlock,
> http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/2009-March/006761.html
>
> There was a bug in jbuf.c (v1.1) which resulted in an infinite loop. The
> PJSIP release log shows that the issue has been fixed in PJSIP 1.2. So I
> upgraded to 1.2 (I was using 1.1), did not change my code at all, and the
> issue doesn't seem to appear anymore.
>
> Nevertheless, you have a valid point that a deadlock may happen when calling
> pjsua_ calls from a PJSIP callback event. Previously I observed deadlocks
> when calling pjsua_call_makecall and pjsua_call_hangup, which could only be
> resolved by calling these APIs from a separate thread but I did not know it
> had to do with the callback event.
>
> As an experiment, I run PJSIP 1.1 with code modified to do the pjsua_ calls
> from a separate thread (and not in the PJSIP callback function) and the
> issue also does not appear again.
>
> Thanks
> _______________________________________________
> Visit our blog: http://blog.pjsip.org
>
> pjsip mailing list
> ***@lists.pjsip.org
> http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org
>
> _______________________________________________
> Visit our blog: http://blog.pjsip.org
>
> pjsip mailing list
> ***@lists.pjsip.org
> http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org
>
>
David Clark
2009-06-20 02:49:01 UTC
Permalink
At 02:48 PM 6/19/2009, Joel Dodson wrote:
>Hi David,
>
>when you say you're not in a call back function, you mean the pjsua_*
>calls you are making are not done from a PJSIP callback function? Or
>you mean the deadlock is not in a callback function?

When I say callback I mean, on_call_state() on_incoming_call(),
on_playwave_eof()
those type of things. If you look at pjsua.exe you will see my model
for doing most
pjsip stuff. I do have some application locks. But those are much
simpler than what
is going on in pjsip. They are mostly for doing things like, lock a
mutex, add a item
to a list, unlock that same mutex. There is no concept of trylock()
in my application.

So far what I have seen is locking between threads with the conf
mutex and the pjsua
mutex. Basically if you are in conference.c get_frame and lock the
conf mutex, a
pjsua callback had better not come to you. For the first pjsua_ call
is going to break
things.

But what this 70009 is I don't know. But I will look at the mutex
debug listing of it
next.


>I'm not using PJSUA. I'm using the lower level APIs, but I ran into a
>deadlock issue that would also be an issue for PJSUA. The problem was
>interaction between my locking mechanism and PJSIPs. I had locks on
>resources that I acquire when working on the resource. When the
>resource would call some PJSIP functions, an internal lock was
>acquired (one I wasn't aware of at the time). That same internal lock
>is acquired in some cases before my callbacks are called.
>
>I had a case where one thread had my application layer resource lock
>and called a PJSIP function which acquired the dialog lock. On
>another thread (a PJSIP thread) the dialog lock had already been
>acquired and that thread was calling one of my callbacks. My callback
>was trying to acquire my application layer lock. Boom. Deadlock.
>
>Now I always acquire the PJSIP dialog lock before trying to acquire my
>application layer resource lock.

Ok...acquire the PJSIP dialog lock...my stuff is higher level than that. Which
might be part of my problem. I execute pjsua_ functions but I never
specifically
get a pjsip lock. The pjsua_ functions do that for me. But I never
found a good
example of doing say audio playback or recording without using the pjsua
functions. Anybody know of a good example that does that? My application
is a IVR application with heavy audio playback, record, conference
bridging, and
get digits functionality. But if I didn't use pjsua_ functions I
could break things out
among multiple conference bridges which would be a good thing.

Thanks,
David Clark


>Hope this helps.
>
>Joel
>
>On Fri, Jun 19, 2009 at 11:48 AM, David Clark<***@tx.rr.com> wrote:
> > Yea even with 1.2 I still have pjsua_call_hangup() and
> pjsua_call_get_info()
> > returning 70009 after like a 3 minute delay.
> > I am back to a mutex debug enabled hammer test. Anybody have any ideas?
> > The pjsua_call_hangup() and pjsua_call_get_info()
> > are not in a callback function. The pjsua_call_get_info() is after a
> > make_call and is optional. I am only doing it to get the call_id string
> > the sip provider sends for tracking. Or in this case pjsip makes up.
> >
> > David Clark
> >
> >
> > At 11:35 PM 6/16/2009, Nguyen Van Minh Danh wrote:
> >
> > Hi David
> >
> > Thanks for the suggestions. I was doing the playing/recording/conference
> > add/remove port inside the on_call_state callback event. After some
> > research, I think the issue I mentioned has got to do with this
> >
> > [pjsip] Infinite loop in jbuf causing deadlock,
> >
> http://lists.pjsip.org/pipermail/pjsip_lists.pjsip.org/2009-March/006761.html
> >
> > There was a bug in jbuf.c (v1.1) which resulted in an infinite loop. The
> > PJSIP release log shows that the issue has been fixed in PJSIP 1.2. So I
> > upgraded to 1.2 (I was using 1.1), did not change my code at all, and the
> > issue doesn't seem to appear anymore.
> >
> > Nevertheless, you have a valid point that a deadlock may happen
> when calling
> > pjsua_ calls from a PJSIP callback event. Previously I observed deadlocks
> > when calling pjsua_call_makecall and pjsua_call_hangup, which could only be
> > resolved by calling these APIs from a separate thread but I did not know it
> > had to do with the callback event.
> >
> > As an experiment, I run PJSIP 1.1 with code modified to do the pjsua_ calls
> > from a separate thread (and not in the PJSIP callback function) and the
> > issue also does not appear again.
> >
> > Thanks
> > _______________________________________________
> > Visit our blog: http://blog.pjsip.org
> >
> > pjsip mailing list
> > ***@lists.pjsip.org
> > http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org
> >
> > _______________________________________________
> > Visit our blog: http://blog.pjsip.org
> >
> > pjsip mailing list
> > ***@lists.pjsip.org
> > http://lists.pjsip.org/mailman/listinfo/pjsip_lists.pjsip.org
> >
> >
m***@ascenttelecom.com
2009-06-18 20:37:36 UTC
Permalink
Any Comment?

Few things i noticed in debug.

1- According to last update of the Authentication algorithm, Expiry is
being calculated correctly I think.

2- Moreover according to SIP Trace i don't see UR change in Contact as this
was registered from public IP.

3- There is no register sent with Expiry =0 (Unregister)

I do not understand why PJSIP is sending register again and again. I put a
debug on tsx_callback function and i see that its being called again and
again with 200 OK event. Where can i check what exactly is making pjsip
send register again and again?

Regards,

Manoj
-----Original Message-----
From: pjsip-***@lists.pjsip.org
[mailto:pjsip-***@lists.pjsip.org]On Behalf Of Manoj Joshi
Sent: Thursday, June 18, 2009 11:35 AM
To: pjsip list
Subject: Re: [pjsip] PJSIP Symbian_uaGUI sends registration every second


Hello Benny,

I saw that someone else faced this issue in past and you asked for SIP
trace. After that there was no message exchanged. I have sent the SIP trace
can you please comment?

Regards,

Manoj
-----Original Message-----
From: pjsip-***@lists.pjsip.org
[mailto:pjsip-***@lists.pjsip.org]On Behalf Of Manoj Joshi
Sent: Wednesday, June 17, 2009 12:43 PM
To: pjsip list
Subject: [pjsip] PJSIP Symbian_uaGUI sends registration every second


Hello Guys,

I am facing a wierd issue related to registration with one of the SIP
Registrar. With this Registrar, Pjsip keeps on sending registration every
second. I have read the previous issues related to malformed contact header
and tried to change some values but it never stops sending registration.
Here is the SIP messages i captured at server.

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrntA
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0

SIP/2.0 100 Trying
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrn
tA
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Server: DIGITALK Multiservice Platform rev.239
Content-Length: 0


SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrn
tA
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>;tag=ba08b629fs
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="6f1fff64c9a9619e2a4587b705b3fe6a5ae84778",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrntA
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0

SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjGq.xVIF.AHgPggYpEhf3g-OBteLIrn
tA
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>;tag=ba08b629fs
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38212 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="6f1fff64c9a9619e2a4587b705b3fe6a5ae84778",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjt0IEQVfsVEAQVZrmqeAiKQlQv4ekffWg
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38213 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Authorization: Digest username="500100467", realm="sip.xyz.com",
nonce="6f1fff64c9a9619e2a4587b705b3fe6a5ae84778", uri="sip:sip.xyz.com",
response="ef7c42c065e7ec56ccaa5f6e9c74b913", algorithm=MD5
Content-Length: 0


SIP/2.0 200 OK
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjt0IEQVfsVEAQVZrmqeAiKQlQv4ekff
Wg
From: <sip:***@sip.xyz.com>;tag=wohyAXwWxz-K5fp9.j33CHcMkePEQf2u
To: <sip:***@sip.xyz.com>;tag=bbe22c93ds
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38213 REGISTER
Contact: <sip:***@114.31.177.96:5060>;expires=3383;q=1.0
Server: DIGITALK Multiservice Platform rev.239
Expires: 3383
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQFsW
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0

SIP/2.0 100 Trying
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQF
sW
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Server: DIGITALK Multiservice Platform rev.239
Content-Length: 0

SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQF
sW
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>;tag=b81dec22es
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="b09fd8072b58a2e3fef032d18e2431b61b4cff0b",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQFsW
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Content-Length: 0


SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjB3YvQhtuL2DRO6skFLJnSNGyOLTOQF
sW
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>;tag=b81dec22es
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38214 REGISTER
Server: DIGITALK Multiservice Platform rev.239
WWW-Authenticate: Digest
realm="sip.xyz.com",nonce="b09fd8072b58a2e3fef032d18e2431b61b4cff0b",algorit
hm=MD5
Content-Length: 0

REGISTER sip:sip.xyz.com SIP/2.0
Via: SIP/2.0/UDP
114.31.177.96:5060;rport;branch=z9hG4bKPjT1O4N33LqqhGYp-HG9xlkEO5B-3hhPQ6
Max-Forwards: 70
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38215 REGISTER
Contact: <sip:***@114.31.177.96:5060>
Expires: 3600
Authorization: Digest username="500100467", realm="sip.xyz.com",
nonce="b09fd8072b58a2e3fef032d18e2431b61b4cff0b", uri="sip:sip.xyz.com",
response="a77b5fb5f65036faee7358f1aa38db3f", algorithm=MD5
Content-Length: 0

SIP/2.0 200 OK
Via: SIP/2.0/UDP
114.31.177.96:5060;rport=5060;branch=z9hG4bKPjT1O4N33LqqhGYp-HG9xlkEO5B-3hhP
Q6
From: <sip:***@sip.xyz.com>;tag=fdHdxyFt2MfYfBTVdeNs5uqA60bd44yt
To: <sip:***@sip.xyz.com>;tag=b17feaffas
Call-ID: lmn9zffkzpx3nxBXF8ef7wqRtLxqHet2
CSeq: 38215 REGISTER
Contact: <sip:***@114.31.177.96:5060>;expires=3576;q=1.0
Server: DIGITALK Multiservice Platform rev.239
Expires: 3576
Content-Length: 0

-----------------------------

--------------------------------

..and so on.....

Regards,

Manoj











--------------------------------------------------------------------
mail2web.com – What can On Demand Business Solutions do for you?
http://link.mail2web.com/Business/SharePoint
mayamatakeshi
2009-06-19 01:57:09 UTC
Permalink
On Fri, Jun 19, 2009 at 5:37 AM,
***@ascenttelecom.com<***@ascenttelecom.com> wrote:
> Any Comment?
>
> Few things i noticed in debug.
>
> 1- According to last update of the Authentication algorithm, Expiry is
> being calculated correctly I think.
>
> 2- Moreover according to SIP Trace i don't see UR change in Contact as this
> was registered from public IP.
>
> 3- There is no register sent with Expiry =0 (Unregister)
>
> I do not understand why PJSIP is sending register again and again. I put a
> debug on tsx_callback function and i see that its being called again and
> again with 200 OK event. Where can i check what exactly is making pjsip
> send register again and again?

Hello,
I have not dug into PJSIP code that much, but if I was having this
problem, I would try to debug at the point the timer for
re-registration is scheduled. That would be some call to
pj_timer_heap_schedule or pjsip_endpt_schedule_timer somewhere after
reception of the final response.
m***@ascenttelecom.com
2009-06-19 05:19:12 UTC
Permalink
Yes this function pjsip_endpt_schedule_timer( regc->endpt, &regc->timer,
&delay); is being called from tsx_callback upon every message received. But
the issue is this function is being called again and again. I can see that
expiration is being set 3600 but callback function is called again after a
second and this time it sets different expiration values.

That means despite setting expiration somthing is causing pjsip to send
register again. Since this is happening only with this SIP Registrar, i
sent SIP traces to everyone.

Regards,

Manoj

-----Original Message-----
From: mayamatakeshi [mailto:***@gmail.com]
Sent: Friday, June 19, 2009 7:27 AM
To: ***@ascenttelecom.com; pjsip list
Subject: Re: [pjsip] PJSIP Symbian_uaGUI sends registration every second


On Fri, Jun 19, 2009 at 5:37 AM,
***@ascenttelecom.com<***@ascenttelecom.com> wrote:
> Any Comment?
>
> Few things i noticed in debug.
>
> 1- According to last update of the Authentication algorithm, Expiry is
> being calculated correctly I think.
>
> 2- Moreover according to SIP Trace i don't see UR change in Contact as
this
> was registered from public IP.
>
> 3- There is no register sent with Expiry =0 (Unregister)
>
> I do not understand why PJSIP is sending register again and again. I put a
> debug on tsx_callback function and i see that its being called again and
> again with 200 OK event. Where can i check what exactly is making pjsip
> send register again and again?

Hello,
I have not dug into PJSIP code that much, but if I was having this
problem, I would try to debug at the point the timer for
re-registration is scheduled. That would be some call to
pj_timer_heap_schedule or pjsip_endpt_schedule_timer somewhere after
reception of the final response.

--------------------------------------------------------------------
mail2web.com – What can On Demand Business Solutions do for you?
http://link.mail2web.com/Business/SharePoint
mayamatakeshi
2009-06-19 06:14:28 UTC
Permalink
On Fri, Jun 19, 2009 at 2:19 PM,
***@ascenttelecom.com<***@ascenttelecom.com> wrote:
> Yes this function pjsip_endpt_schedule_timer( regc->endpt, &regc->timer,
> &delay); is being called from tsx_callback upon every message received. But
> the issue is this function is being called again and again. I can see that
> expiration is being set 3600 but callback function is called again after a
> second and this time it sets different expiration values.
>
> That means despite setting expiration somthing is causing pjsip to send
> register again. Since this is happening only with this SIP Registrar, i
> sent SIP traces to everyone.

Well,
fortunately the behavior is reproducible.
I would compare the messages with the ones used with the other
Registrars that don't show this problem.
If no significant difference can be perceived, I would write a SIPp
uas scenario based on the messages exchanged.
This scenario is supposed to be able to recreate the problem.
Then I would modify its contents little-by-little until the problem
disappears so that I could identify what thing could be triggering the
problem (it seems like a pjsip bug).
Continue reading on narkive:
Loading...