Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Socket Freezed on RabbitMq disconnection on windows #297

Closed
allancomar opened this issue Aug 21, 2015 · 9 comments
Closed

Socket Freezed on RabbitMq disconnection on windows #297

allancomar opened this issue Aug 21, 2015 · 9 comments
Labels
Milestone

Comments

@allancomar
Copy link

Hi. Looking at the differences between master version and 0.6 we notice that you changed the select to use an infinite time on code:

#elif defined(HAVE_SELECT)
  fd_set fds;
  int res;
  struct timeval tv;
  struct timeval *tvp;

  assert(event == AMQP_SF_POLLIN || event == AMQP_SF_POLLOUT);

start_select:
  FD_ZERO(&fds);
  FD_SET(fd, &fds);

  res = amqp_time_tv_until(deadline, &tv, &tvp);
  if (res != AMQP_STATUS_OK) {
    return res;
  }

  switch (event) {
    case AMQP_SF_POLLIN:
      res = select(fd + 1, &fds, NULL, NULL, tvp);
      break;
    case AMQP_SF_POLLOUT:
      res = select(fd + 1, NULL, &fds, NULL, tvp);
  }

  if (0 < res) {
    return AMQP_STATUS_OK;
  } else if (0 == res) {
    return AMQP_STATUS_TIMEOUT;
  } else {
    switch (amqp_os_socket_error()) {
      case EINTR:
        goto start_select;
      default:
        return AMQP_STATUS_SOCKET_ERROR;
    }
  }
  return AMQP_STATUS_OK;

If we try to connect to a closed port this code in windows stays on the select even if the port got open after ( like when you restart rabbitMq).

Do you have any sugestion to address this behavior?

Thanks, Allan

@alanxz
Copy link
Owner

alanxz commented Aug 24, 2015

The implementation changed from poll() to select() in 0.7 of rabbitmq-c, the effective timeout should not have changed between v0.6 and v0.7. Unfortunately I don't have access to a Win32 machine these days. Could you provide an example that uses this code that reproduces the issue you're seeing here?

@allancomar
Copy link
Author

Hi. I just made a simple change on the code using the exception set. I am attaching the patch file here. If you want we can provide some assist to test and / or mantain the windows part of the code.

Left file: E:\Temp\rabbitmq-c-master\librabbitmq\amqp_socket.c
Right file: E:\projetos\rabbitmq-c-master\librabbitmq\amqp_socket.c
305a306
>   fd_set fdexceptset;
314a316,318
>   FD_ZERO(&fdexceptset);
>   FD_SET(fd, &fdexceptset);
>   
323c327
<       res = select(fd + 1, &fds, NULL, NULL, tvp);
---
>       res = select(fd + 1, &fds, NULL, &fdexceptset, tvp);
326c330
<       res = select(fd + 1, NULL, &fds, NULL, tvp);
---
>       res = select(fd + 1, NULL, &fds, &fdexceptset, tvp);
328a333,335
>   if (FD_ISSET(fd, &fdexceptset))
>     return AMQP_STATUS_SOCKET_ERROR;
> 
332c339
<     return AMQP_STATUS_TIMEOUT;
---
>     goto start_select;

Thank you, Allan

@allancomar
Copy link
Author

oh my. Can I send you the patch in any other way?

See you!

@alanxz
Copy link
Owner

alanxz commented Aug 25, 2015

If you have a patch, please open a pull-request.

For my own reference, select(), the exceptfds need to be set to detect connection failure. I believe it was this way before, then somehow got lost when we switched from select -> poll -> nonblocking -> select() only on Win32.

@allancomar
Copy link
Author

just commited the code. Is anything else I should do?

@alanxz
Copy link
Owner

alanxz commented Aug 26, 2015

Could you open a pull-request with the changed code?

Here's some instructions on how to get this done if you're not familiar:
https://help.github.com/articles/using-pull-requests/

alanxz added a commit that referenced this issue Oct 14, 2015
When doing a nonblocking connect() on win32, select() reports failure using
exceptfds instead of writefds. Allow this narrow case when doing a non-blocking
connect on Win32.

See:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740141(v=vs.85).aspx

Fixes #297
@alanxz alanxz added this to the v0.8.0 milestone Oct 14, 2015
@alanxz alanxz added the t:bug label Oct 14, 2015
@alanxz
Copy link
Owner

alanxz commented Oct 14, 2015

@allancomar could you try pulling down the branch referenced in #306 and see if it fixes your issue?

alanxz added a commit that referenced this issue Oct 14, 2015
When doing a nonblocking connect() on win32, select() reports failure using
exceptfds instead of writefds. Allow this narrow case when doing a non-blocking
connect on Win32.

See:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740141(v=vs.85).aspx

Fixes #297
alanxz added a commit that referenced this issue Oct 14, 2015
When doing a nonblocking connect() on win32, select() reports failure using
exceptfds instead of writefds. Allow this narrow case when doing a non-blocking
connect on Win32.

See:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms740141(v=vs.85).aspx

Fixes #297
@allancomar
Copy link
Author

Sry Alan. I just miss that. will do it ASAP ;)

On Wed, Oct 14, 2015 at 12:48 AM, Alan Antonuk notifications@github.com
wrote:

@allancomar https://github.com/allancomar could you try pulling down
the branch referenced in #306
#306 and see if it fixes your
issue?


Reply to this email directly or view it on GitHub
#297 (comment).

Allan.

@zengzhenbo
Copy link

I met same problem on Linux , when use SimpleAMQPClient to create channel by CreateFromUri, and the version of rabbitmq-c is v0.7.1
The latest release noted that fixed the bug: Connection failure results in hang on Win32
Is this fixed solve the problem on Linux version?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants