
monitor_operation::PostAndWait prevents shutdown #31

Open
akapelrud opened this issue Mar 16, 2016 · 13 comments
@akapelrud
Contributor

The current implementation of monitor_operation::PostAndWait prevents the call to async_monitor_thread_.join(); in shutdown_service() from returning. PostAndWait waits on a std::condition_variable that is never notified, because the lambda handler is never run after shutdown. This prevents the thread from ever finishing.

Either remove the waiting part, or possibly do something like:

void PostAndWait(const boost::system::error_code ec, const dir_monitor_event& ev) const
{
    std::mutex post_mutex;
    std::condition_variable post_condition_variable;
    bool post_cancel = false;

    // Post the handler to the external io_service and flag completion
    // once it has actually run.
    this->io_service_.post(
        [&]
        {
            handler_(ec, ev);
            std::lock_guard<std::mutex> lock(post_mutex);
            post_cancel = true;
            post_condition_variable.notify_one();
        }
    );

    // Wait for the handler to run, but wake up periodically so that a
    // stopped io_service (whose handlers will never run) cannot block
    // shutdown forever.
    std::unique_lock<std::mutex> lock(post_mutex);
    while (!post_cancel)
    {
        post_condition_variable.wait_for(lock, std::chrono::milliseconds(1000));
        if (this->io_service_.stopped())
            break;
    }
}
@berkus
Owner

berkus commented Mar 24, 2016

Do you want to make a PR for this?

@sbelsky
Contributor

sbelsky commented Mar 24, 2016

... and a test case, please ;)

@sbelsky
Contributor

sbelsky commented Mar 24, 2016

This method ensures the dir_monitor service completes correctly when the external io_service (the one passed to basic_dir_monitor_service in its constructor) is destroyed. You can see an attempt to simulate this situation in blocked_async_call, so your second suggestion is better.
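
For context, a rough sketch of the scenario such a test simulates (hypothetical code: the include path and directory name are assumptions, and the API usage follows this repository's README):

#include <boost/asio.hpp>
#include "dir_monitor/dir_monitor.hpp" // assumed include path

void simulate_blocked_async_call()
{
    // The "external" io_service, destroyed at the end of this scope.
    boost::asio::io_service io_service;
    boost::asio::dir_monitor dm(io_service);
    dm.add_directory("watched_dir"); // hypothetical directory

    dm.async_monitor([](const boost::system::error_code&,
                        const boost::asio::dir_monitor_event&)
    {
        // Never invoked here: run() is never called on io_service.
    });

    // io_service goes out of scope with the operation still pending; its
    // destructor runs shutdown_service, which must complete without hanging.
}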

@akapelrud
Contributor Author

I'm away from my computer for the remainder of Easter (until Monday), so I won't be able to test this properly until then.

The problem I see with using a condition variable (a synchronization device) is that it prevents the internal (implementation) thread from ever finishing. Your test case will be valid, but if you were to test this in a (multi)threaded application where your external/local io_service is stopped, you would find that your application never closes. I'll ponder how to write a proper test case for this until I can get back to my computer.

@akapelrud
Contributor Author

Question after reading the comment in the test case: why would you ever want async_monitor to block?

  • async_monitor is essentially a call to io_service.post(), which should be non-blocking.
    boost::asio picks up your work from post() and delivers the result to you when you invoke your io_service's run() function (which blocks, possibly on multiple threads). Blocking your service's implementation thread on something that might never happen in real life (i.e. a dir event) seems like a very bad idea (notice that you forced an event in your test case).
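
For illustration, a minimal sketch of that non-blocking flow (API usage per the README; the directory name is an assumption):

#include <boost/asio.hpp>
#include "dir_monitor/dir_monitor.hpp" // assumed include path

int main()
{
    boost::asio::io_service io_service;
    boost::asio::dir_monitor dm(io_service);
    dm.add_directory("watched_dir"); // hypothetical directory

    // Returns immediately, just like io_service.post().
    dm.async_monitor([](const boost::system::error_code& ec,
                        const boost::asio::dir_monitor_event& ev)
    {
        // Invoked from within run() once an event (or error) arrives.
    });

    // The blocking part: run() dispatches completed handlers and may be
    // called from multiple threads.
    io_service.run();
}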

@berkus
Owner

berkus commented Mar 30, 2016

Async processing most probably needs more fixes.

@akapelrud
Contributor Author

As per the Boost documentation: "A service's shutdown_service member function must cause all copies of user-defined handler objects that are held by the service to be destroyed."

When shutdown_service is called, the implementation is destroyed, which notifies the condition variable in popfront_event(). After popfront_event() returns, the supplied handler is posted to the external io_service.
I think this is where the current code is wrong. It is the service's responsibility to release that handler in shutdown_service (as the Boost requirement states), so we can't post it back to the external thread here.

A simple fix to this problem is to not post the handler externally if the error code after the return from popfront_event() is operation_aborted. PostAndWait() goes out the window as well. This seems to solve my initial problem as well as pass the current test cases; i.e. blocked_async_call_handler is never reached in this case, and the blocked_async_call test case finishes like it should.
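
A hedged sketch of that check, with hypothetical names (the real change belongs where the service posts the handler after popfront_event() returns):

#include <functional>
#include <boost/asio.hpp>

// Sketch: deliver a popped event to the external io_service only when the
// wait was not aborted by shutdown_service. On operation_aborted the handler
// is simply dropped, as the Boost.Asio service requirements demand.
template <typename Handler, typename Event>
void post_if_not_aborted(boost::asio::io_service& io_service,
                         Handler handler,
                         const boost::system::error_code& ec,
                         const Event& ev)
{
    if (ec == boost::asio::error::operation_aborted)
        return; // shutting down: destroy the handler instead of posting it

    io_service.post(std::bind(handler, ec, ev));
}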

I'll see if I can create a PR and a second test case for my initial problem (against the new develop branch).

@berkus
Owner

berkus commented Apr 2, 2016

This is supposed to be fixed by PR #40, right?

@sbelsky
Contributor

sbelsky commented Apr 2, 2016

@berkus this issue is about the Windows implementation, while PR #40 affects *nix.
But @akapelrud, can we propagate the PR #40 solution to the current (Windows) implementation?

@akapelrud
Contributor Author

@berkus Yes, PR #40 was the fix for this issue.

@sbelsky My issue was actually with the inotify implementation, but I now see that the Windows implementation is similar, so I guess the problem exists there as well. I haven't tried to compile the project on Windows (I need to install MSVC first), but it seems likely that the contents of PR #40 need to be mirrored to windows/basic_dir_monitor_service.hpp as well.

@sbelsky
Contributor

sbelsky commented Apr 2, 2016

A fast check gives me an unsatisfactory result (((( blocked_async_call consistently fails in the AppVeyor CI environment (Win Server 2012 / VS2013). This is the main reason I can't make a PR with my "add appveyor support" branch.

@akapelrud
Contributor Author

@sbelsky I'll install MSVC (probably VS2015) and test this further this week.

@sbelsky
Contributor

sbelsky commented Apr 4, 2016

@akapelrud I bet you won't see any issue with your local build.
I've tested it with the Win 8 / VS2013 pair and everything was fine. The bug only reproduces on the AppVeyor server. At the moment I'm thinking about how to obtain symbols and a dump for further analysis.
