LIBUV: Crashes while closing handles #79
Also, I am new to GitHub and not sure how to attach files. Please advise.
Hi Folks: With limited testing, the problem ceases to happen if you force uv_run() in the IO_Task() (see the #ifdef CLOSE_KLUDGE2 sections).
Best Regards, Paul R.
Hi Folks: I think I see the heart of my problem. Everything appears to work correctly for the first connection. The comments about coalescing of uv_async_send() calls in the documentation are somewhat misleading. When another incoming connection is attempted, the following things occur. Notice in 2.3, below, that uv_poll_start() executes without being called.
2.1) A poll handle is successfully allocated in the IO_Trigger_Task() thread.
2.2) uv_poll_start() is invoked via a call.
2.3) uv_poll_start() executes again without being called! This is what you see in GDB, which is very strange since I know there is only one breakpoint: Breakpoint 1, IO_Trigger_Task (arg=0x0) at network_io.c:212. This is the relevant code of the IO_Trigger_Task() thread.
2.4) The polling callback function never executes. NOTE: The polling loop, Poll_Loop, of type uv_loop_t, is already running and was started earlier.
This is the sequence of operations used to free the first connection.
1.1) Wait for the poll handle to be freed and then release the async handle.
2.1) Wait for the connect handle to be freed and then release the async handle.
This is the code of the proxy callback routines and the close callback routine.
ENTER_MUTEX(&Service_Q_Mutex);
Best Regards, Paul R.
Hi Folks: This is what happens at a lower level. Do you have insight about the cause? The main() process and IO_Task() are executing concurrently. NOTE: The IO_Trigger_Task() has been eliminated and the uv_poll_start() call is invoked in the IO_Task().
This is the start of uv__finish_close() in src/unix/core.c:
static void uv__finish_close(uv_handle_t* handle) {
This is the code that executes in the IO_Task() from network_io.c.
This is the code from unix/poll.c. Line 92 is the call to uv__poll_stop().
int uv_poll_start(uv_poll_t* handle, int pevents, uv_poll_cb poll_cb) {
    assert((pevents & ~(UV_READABLE | UV_WRITABLE)) == 0);

    uv__poll_stop(handle);

    if (pevents == 0)
        return 0;

    events = 0;
    uv__io_start(handle->loop, &handle->io_watcher, events);

    return 0;
}
main() Stack Trace
#0 0x00007ffff751e267 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
IO_Task() Stack Trace
#0 uv__poll_stop (handle=0x831bc0) at src/unix/poll.c:73
Best Regards, Paul R.
Hi Folks: After taking several wrong paths I isolated the problem. It occurs when uv_close() is called as follows to close the async handle.
Now if another async operation is initiated with the same handle, the following crash occurs:
#0 0x00007f256fdf0267 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
There are two obvious possible causes.
This problem stops occurring if you don't call uv_close(). This is not a problem in my case. The Libuv documentation states that a uv_async_t handle remains activated until uv_close() is called. If this is a Libuv problem, perhaps it would be better to deal with it there. It appears that coalescing of uv_async_send() calls is still an issue.
Best Regards, Paul R.
PS: The reason I am considering using Libuv is that it provides a convenient epoll() wrapper.
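One detail worth noting: apart from uv_async_send(), libuv handles and loops are only safe to touch from the thread running their loop, so the uv_close() on the async handle should also happen on the loop thread. The fragment below is only a sketch of one way to arrange that, assuming libuv 1.x callback signatures; the shutdown_requested flag, the wake_handle name, and the request_shutdown() helper are illustrative and not taken from the code in this issue.
#include <uv.h>
#include <stdatomic.h>

static atomic_int shutdown_requested;   // set by any thread
static uv_async_t wake_handle;          // uv_async_init()'d on the loop with wake_callback

// Runs on the loop thread, so calling uv_close() here is legal.
static void wake_callback(uv_async_t *handle)
{
    if(atomic_load(&shutdown_requested))
    {
        uv_close((uv_handle_t *) handle, NULL);
        return;
    }
    // ... normal handling of the wakeup ...
}

// Callable from any other thread: ask the loop thread to close the handle.
static void request_shutdown(void)
{
    atomic_store(&shutdown_requested, 1);
    uv_async_send(&wake_handle);
}
Because the handle is closed by its own callback, it is never reused after uv_close(), which avoids the assertion seen above.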
Hi Folks: A .tar file containing my prototype Server code is attached to this note. Under moderate testing, the problem still occurs.
Note that LIVUV_RELEASE is defined and identifies the relevant code sections.
What I need to do is have one task, A, prompt another task, B, to perform an operation. Could you point me in the right direction? I think the procedure described below is where my problem lies. This is a logical description of what occurs in my Server. For purposes of simplicity:
Task A: A task which can receive messages via a condition variable and an input queue, using uv_cond_wait().
Task B: A loop oriented task that is executing uv_run().
Then we perform the following operations in order to make task B perform the desired operation.
Currently this is done with uv_async_send() after the proxy routine to execute has been registered.
The proxy routine executes in task B and performs the operation.
Upon completion of the operation, Task B sends a notification message to task A.
Best Regards, Paul R.
/tmp/code.tar
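For reference, the pattern described above is often implemented with a single uv_async_t plus a mutex-protected request queue. A minimal sketch follows, assuming libuv 1.x; the request_t type, the queue variables, and submit_and_wait() are hypothetical names, and the async callback plays the role of the proxy routine running on task B's loop thread. The mutex, condition variable, and async handle are assumed to be initialized at startup with uv_mutex_init(), uv_cond_init(), and uv_async_init().
#include <uv.h>
#include <stddef.h>

// Hypothetical request carried from task A to task B.
typedef struct request
{
    void (*work)(struct request *req);    // executed on task B's loop thread
    struct request *next;
    int done;                             // completion flag, guarded by q_mutex
} request_t;

static uv_mutex_t q_mutex;
static uv_cond_t  q_done_cond;
static request_t *q_head;                 // pushed LIFO for brevity
static uv_async_t q_async;                // uv_async_init()'d on task B's loop with drain_queue

// Proxy routine: runs in task B.  It drains the whole queue because several
// uv_async_send() calls may have been coalesced into a single wakeup.
static void drain_queue(uv_async_t *handle)
{
    request_t *req;
    (void) handle;

    for(;;)
    {
        uv_mutex_lock(&q_mutex);
        req = q_head;
        if(req != NULL)
            q_head = req->next;
        uv_mutex_unlock(&q_mutex);
        if(req == NULL)
            break;

        req->work(req);                   // perform the operation in task B

        uv_mutex_lock(&q_mutex);
        req->done = 1;
        uv_cond_broadcast(&q_done_cond);  // notify task A
        uv_mutex_unlock(&q_mutex);
    }
}

// Called from task A: enqueue the request, wake task B, wait for completion.
static void submit_and_wait(request_t *req)
{
    uv_mutex_lock(&q_mutex);
    req->done = 0;
    req->next = q_head;
    q_head = req;
    uv_mutex_unlock(&q_mutex);

    uv_async_send(&q_async);              // the only libuv call that is safe cross-thread

    uv_mutex_lock(&q_mutex);
    while(!req->done)
        uv_cond_wait(&q_done_cond, &q_mutex);
    uv_mutex_unlock(&q_mutex);
}
Because the queue, not the async handle, carries the requests, coalesced uv_async_send() calls cannot lose work.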
Hi Folks: A version of ASYNC_WAIT() with the right logic follows. As I'm sure you guessed, my basic strategy for managing the uv_async_send() "channel" is to treat it as a resource that must be acquired before use.
ENTER_MUTEX(&Async_Mutex);
Best Regards, Paul R.
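For readers following along, a guard of this kind usually looks something like the sketch below: a busy flag protected by Async_Mutex plus a condition variable, so that only one uv_async_send() request is outstanding at a time and coalescing cannot lose requests. This is only an illustration; Async_Cond and Async_Busy are assumed names and the real ASYNC_WAIT() may differ.
static uv_mutex_t Async_Mutex;
static uv_cond_t  Async_Cond;    // assumed companion to Async_Mutex
static int        Async_Busy;    // non-zero while one send is outstanding

// Acquire exclusive use of the async "channel" before calling uv_async_send().
static void async_acquire(void)
{
    uv_mutex_lock(&Async_Mutex);
    while(Async_Busy)
        uv_cond_wait(&Async_Cond, &Async_Mutex);
    Async_Busy = 1;
    uv_mutex_unlock(&Async_Mutex);
}

// Release the channel from the async callback, on the loop thread,
// once the request has actually been consumed.
static void async_release(void)
{
    uv_mutex_lock(&Async_Mutex);
    Async_Busy = 0;
    uv_cond_signal(&Async_Cond);
    uv_mutex_unlock(&Async_Mutex);
}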
This is not a repository for libuv. If you are reporting a bug in libuv, please do so at https://github.com/libuv/libuv. It is quite possible that the version of libuv included in the book has had several bugs fixed since, as this book is really old.
Hi Folks:
My Libuv based Server performs all its functions correctly except for TCP connection termination.
Each TCP connection has a uv_tcp_t connection handle and a uv_poll_t handle whose allocation
and operation are explained below. When the Protocol_Task() thread needs to terminate
a connection, it must stop polling, terminate the TCP socket connection, and deallocate
the handles.
NOTE: I am using the GitHub distribution from the following link on Ubuntu Linux version 15.04.
The Libuv software package looks like version 1.3.0.
I have had to take extraordinary measures to make connection release reliable.
The relevant code is included near the end of this message and the extraordinary
measures are in the CLOSE_KLUDGE sections. The difficulty arises because the
Libuv loops are not used in the Protocol_Task(), yet it must effect operations
on those loops to release handles. It would be nice if Libuv included an API
for releasing handles reliably which could be called from any task.
Connection release still fails about 15% of the time, in which case a crash occurs
and the following diagnostic is displayed.
More diagnostic information follows. Do you know what causes this crash?
I strongly suspect using Linux recv() to read data is not optimal when epoll() is
being used. My understanding is that there is a way to pass buffers to epoll() such that
data will automatically be inserted in them when a UV_READABLE event occurs. Do you have
any advice about this?
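On the recv()/epoll question: epoll() itself only reports readiness and does not take user buffers, so one idiomatic alternative, sketched below and not taken from the attached code, is to drop the uv_poll_t layer and let libuv read from the accepted uv_tcp_t handle directly with uv_read_start(). The alloc_callback and read_callback names are illustrative; the signatures are those of libuv 1.x.
#include <uv.h>
#include <stdlib.h>

// libuv asks for a buffer before each read.
static void alloc_callback(uv_handle_t *handle, size_t suggested_size, uv_buf_t *buf)
{
    (void) handle;
    buf->base = malloc(suggested_size);
    buf->len = suggested_size;
}

// Called on the loop thread whenever a chunk of data, EOF, or an error arrives.
static void read_callback(uv_stream_t *stream, ssize_t nread, const uv_buf_t *buf)
{
    if(nread > 0)
    {
        // Hand buf->base[0..nread) to the protocol message parser here.
    }
    else if(nread < 0)
    {
        uv_close((uv_handle_t *) stream, NULL);    // UV_EOF or a read error
    }
    free(buf->base);
}

// conn_handle is the uv_tcp_t accepted in make_incoming_connection().
static int start_reading(uv_tcp_t *conn_handle)
{
    return uv_read_start((uv_stream_t *) conn_handle, alloc_callback, read_callback);
}
This keeps all reads on the loop that owns the handle and removes the separate recv() path entirely.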
An overview of my Server follows.
Best Regards,
Paul Romero
Multi-Connection TCP Server Functional Architecture Overview
There is a connection descriptor for each incoming TCP connection which contains all data
needed to manage the connection and perform the relevant functions.
When the main() process detects an incoming TCP connection, it sends a notification message to the
IO_Trigger_Task(). The IO_Trigger_Task() then sets up epoll() monitoring of incoming TCP data
for that connection.
Subsequently, the IO_Task() invokes poll_callback() when incoming data is available, reads a chunk
of data, and sends a protocol message to the Protocol_Task() when a complete protocol message is
recognized.
The Timer_Task() sends an expiration notification message to the Protocol_Task() when a protocol
timer expires.
The Protocol_Task() sends messages to the Send_Op_Task() for transmission across the network.
It spawns a DB Operation Task to perform slow database operations, and the DB Operation Task
notifies the Protocol_Task() when the operation is complete and then terminates.
Loops of type uv_loop_t
Tasks: All Libuv thread tasks run concurrently and are launched by main() at startup time.
main(): A Linux process that runs the Connect_Loop to detect incoming TCP connections.
The make_incoming_connection() callback routine accepts incoming connections and
allocates a uv_tcp_t handle on a per connection basis. (See main.c)
IO_Trigger_Task(): A Libuv thread that sets up epoll() plumbing for the IO_Task()
when an incoming TCP connection occurs. It allocates a uv_poll_t handle, on a per
connection basis, and calls uv_poll_start() to initiate epoll() operation with the
Poll_Loop in the IO_Task(). It configures the handle to detect UV_READABLE events and
handles them with the poll_callback() routine. However, it does not run the Poll_Loop.
(Basically, this task just sets up plumbing.) (See network_io.c)
IO_Task(): A Libuv thread that runs the Poll_Loop to handle incoming TCP data, on a per
connection basis. The poll_callback() routine executes and uses normal Linux recv() to read
chunks of data, in non-blocking mode, when a UV_READABLE event occurs.
(See network_io.c)
Timer_Task(): A Libuv thread that runs the Time_Loop to handle ticks, and whose main
function is to detect protocol timer expiration. The tick duration
is configured with uv_timer_init() and uv_timer_start(), and ticks are handled by the
timer_callback() routine. (See timer.c)
Protocol_Task(): A Libuv thread that handles protocol messages sent to it by the following tasks
on a per connection basis: IO_Task(), Timer_Task(), DB Operation Tasks. DB Operation Libuv thread tasks
are spawned by the Protocol_Task() to perform slow database operations and send a notification message
to the Protocol_Task() upon completion of the operation. (See protocol.c and database.c)
Send_Op_Task(): A Libuv thread that transmits all network bound messages with normal
Linux send() on a per connection basis. (See transmit.c)
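A note on the IO_Trigger_Task() described above: the Poll_Loop runs in the IO_Task() thread, so calling uv_poll_start() from the IO_Trigger_Task() means two threads operate on the same loop, which libuv does not permit. One hedged alternative, not taken from network_io.c, is to hand only the new file descriptor to the IO_Task() and let its own thread create and start the poll handle inside an async callback. The pending_fd variable and the helper names below are assumptions for illustration, and error handling is omitted.
#include <uv.h>
#include <stdlib.h>

extern uv_loop_t Poll_Loop;
extern void poll_callback(uv_poll_t *handle, int status, int events);

static uv_mutex_t pending_mutex;
static int        pending_fd = -1;        // one pending connection, for brevity
static uv_async_t poll_setup_async;       // uv_async_init()'d on &Poll_Loop with poll_setup_callback

// Runs on the IO_Task() thread, so touching the Poll_Loop here is safe.
static void poll_setup_callback(uv_async_t *handle)
{
    uv_poll_t *ph;
    int fd;
    (void) handle;

    uv_mutex_lock(&pending_mutex);
    fd = pending_fd;
    pending_fd = -1;
    uv_mutex_unlock(&pending_mutex);
    if(fd < 0)
        return;

    ph = malloc(sizeof(*ph));
    uv_poll_init(&Poll_Loop, ph, fd);
    uv_poll_start(ph, UV_READABLE, poll_callback);
}

// Called from the IO_Trigger_Task() (or directly from main()) for each new connection.
static void request_poll_setup(int fd)
{
    uv_mutex_lock(&pending_mutex);
    pending_fd = fd;
    uv_mutex_unlock(&pending_mutex);
    uv_async_send(&poll_setup_async);
}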
Crash Diagnostics
The crash occurs when uv_run() is executing in the IO_Task() in network_io.c according to the following
GDB stack trace.
#0 0x00007f281754c267 in __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:55
#1 0x00007f281754deca in __GI_abort () at abort.c:89
#2 0x00007f281754503d in __assert_fail_base (fmt=0x7f28176a7028 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
assertion=assertion@entry=0x41e093 "!(handle->flags & UV_CLOSED)", file=file@entry=0x41e068 "src/unix/core.c",
line=line@entry=210, function=function@entry=0x41e2b0 <PRETTY_FUNCTION.9522> "uv__finish_close") at assert.c:92
#3 0x00007f28175450f2 in __GI___assert_fail (assertion=assertion@entry=0x41e093 "!(handle->flags & UV_CLOSED)",
file=file@entry=0x41e068 "src/unix/core.c", line=line@entry=210,
function=function@entry=0x41e2b0 <PRETTY_FUNCTION.9522> "uv__finish_close") at assert.c:101
#4 0x000000000040c967 in uv__finish_close (handle=<optimized out>) at src/unix/core.c:210
#5 uv__run_closing_handles (loop=0x638080 <Poll_Loop>) at src/unix/core.c:259
#6 uv_run (loop=0x638080 <Poll_Loop>, mode=UV_RUN_DEFAULT) at src/unix/core.c:326
#7 0x0000000000404962 in IO_Task (arg=0x0) at network_io.c:226
#8 0x0000000000412ad7 in uv__thread_start (arg=<optimized out>) at src/unix/thread.c:49
#9 0x00007f2817bf06aa in start_thread (arg=0x7f2816d15700) at pthread_create.c:333
#10 0x00007f281761deed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
However, the GDB thread information indicates that RELEASE_CONNECTION(), in protocol.c, is executing
in the Protocol_Task() when the crash occurs.
Id Target Id Frame
6 Thread 0x7f2817516700 (LWP 3424) syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
5 Thread 0x7f2816514700 (LWP 3426) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
4 Thread 0x7f2818003700 (LWP 3423) syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
3 Thread 0x7f2815512700 (LWP 3428) pthread_cond_wait@@GLIBC_2.3.2 ()
at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
2 Thread 0x7f2815d13700 (LWP 3427) 0x0000000000404500 in RELEASE_CONNECTION (cdesc=0x6384c0 <Conn_Desc_Table>)
at protocol.c:357
at ../sysdeps/unix/sysv/linux/raise.c:55
Line 357 of protocol.c is as follows.
WaitClose[] is only modified in two cases and only in the Protocol_Task().
Code
#define CLOSE_KLUDGE
extern uv_loop_t Poll_Loop;
extern uv_loop_t Connect_Loop;
#ifdef CLOSE_KLUDGE
uv_handle_t *WaitClose[MAX_CONN_DESC] = { NULL };
#endif // CLOSE_KLUDGE
ROUTINE void close_callback(uv_handle_t *handle)
{
    int k;

#ifdef CLOSE_KLUDGE
    //
    // Determine if the handle is being closed.
    //
    for(k = 0; k < MAX_CONN_DESC; k++)
    {
        if(WaitClose[k] == handle)
        {
            //
            // Closure is complete.
            //
            WaitClose[k] = NULL;
            break;
        }
    }
#endif // CLOSE_KLUDGE
}
ROUTINE void RELEASE_CONNECTION(CONN_DESC *cdesc)
{
    uv_async_t as_handle;
    struct linger spec;

#ifdef CLOSE_KLUDGE
    WaitClose[cdesc->index] = (uv_handle_t *) cdesc->poll_handle;
#endif // CLOSE_KLUDGE
    //
    // Deactivate and release the poll handle.
    // You have to stop the Poll_Loop to deactivate and deallocate the poll handle.
    //
    uv_stop(&Poll_Loop);
#ifdef CLOSE_KLUDGE
    //
    // Wait for the handle to be closed and deallocated.
    //
    while(WaitClose[cdesc->index]);
#endif // CLOSE_KLUDGE

#ifdef CLOSE_KLUDGE
    WaitClose[cdesc->index] = (uv_handle_t *) cdesc->conn_handle;
#endif // CLOSE_KLUDGE
    //
    // Close and deallocate the connect handle in order to close the socket connection.
    // You have to wake up the Connect_Loop for the close_callback()
    // routine to execute.
    //
    uv_close((uv_handle_t *) cdesc->conn_handle, close_callback);
    //
    // Wake up the Connect_Loop in the main() process.
    //
    uv_async_init(&Connect_Loop, &as_handle, NULL);
    uv_async_send(&as_handle);
    uv_close((uv_handle_t *) &as_handle, NULL);
#ifdef CLOSE_KLUDGE
    //
    // Wait for the handle and socket connection to be released and closed.
    //
    while(WaitClose[cdesc->index]);
#endif // CLOSE_KLUDGE

    ENTER_MUTEX(&Service_Q_Mutex);
    DELETE_CONN(cdesc);
    cdesc->fd = -1;
    flush_msg(&cdesc->task_input_q);
    EXIT_MUTEX(&Service_Q_Mutex);
}
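For comparison with the CLOSE_KLUDGE approach above, here is a hedged sketch of a release path that avoids uv_stop() and the busy-wait: the Protocol_Task() only posts a close request and blocks on a condition variable, while the uv_close() call is made by the thread that owns the handle's loop. The close_request_t type, Close_Mutex/Close_Cond, and the per-loop async handle are assumptions, not code from this issue.
#include <uv.h>

// Illustrative cross-thread close that keeps uv_close() on the loop thread.
typedef struct
{
    uv_handle_t *handle;      // poll or connect handle to close
    uv_async_t  *notify;      // async handle bound to that handle's loop, callback = do_close
    int          closed;      // set by close_done() on the loop thread
} close_request_t;

static uv_mutex_t       Close_Mutex;
static uv_cond_t        Close_Cond;
static close_request_t *Pending_Close;    // one outstanding request, for brevity

// Close callback: runs on the loop thread after the handle is fully closed.
static void close_done(uv_handle_t *handle)
{
    (void) handle;
    uv_mutex_lock(&Close_Mutex);
    Pending_Close->closed = 1;
    uv_cond_broadcast(&Close_Cond);
    uv_mutex_unlock(&Close_Mutex);
}

// Async callback: runs on the loop that owns the handle, so uv_close() is legal.
static void do_close(uv_async_t *async)
{
    (void) async;
    uv_close(Pending_Close->handle, close_done);
}

// Called from the Protocol_Task(): request the close, then sleep until it completes.
static void close_on_loop_thread(close_request_t *req)
{
    uv_mutex_lock(&Close_Mutex);
    Pending_Close = req;
    req->closed = 0;
    uv_mutex_unlock(&Close_Mutex);

    uv_async_send(req->notify);           // wake the owning loop

    uv_mutex_lock(&Close_Mutex);
    while(!req->closed)
        uv_cond_wait(&Close_Cond, &Close_Mutex);
    uv_mutex_unlock(&Close_Mutex);
}
With this shape, RELEASE_CONNECTION() would call close_on_loop_thread() once for the poll handle (via an async handle on the Poll_Loop) and once for the connect handle (via one on the Connect_Loop), and the WaitClose[] spinning would not be needed.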