Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add WebSocket Keep-Alive Ping and Timeout (minimal) implementation #105841

Merged
merged 14 commits into from
Aug 9, 2024

Conversation

CarnaViire
Copy link
Member

@CarnaViire CarnaViire commented Aug 1, 2024

This is a minimal implementation of the API without additional read-ahead logic. In order for the feature to work (to consume Pongs as they arrive in response to the Pings), there needs to be an outstanding user read all/most of the time.

I also tried to minimize any changes to common code, to try to make sure the feature won't negatively affect any existing scenarios. I still kept most of the the tracing events initially added to the draft implementation with the read-ahead (#105672). The read-ahead capability will be added separately.

Fixes #48729

/cc @stephentoub @BrennanConroy @rzikm @karelz @dotnet/ncl

Logging:

  • Added private NetEventSource logs (same as e.g. "Private.InternalDiagnostics.System.Net.Http" etc)
  • Note: there was no logging in WebSockets at all prior to this
  • The logs proven to be extremely useful when debugging/investigating
  • Will help us troubleshoot in case any issues surface in the future (even unrelated to the feature)
  • Perf measurements linked are with the logging off but still present in code

Testing:

  • CI (functional tests, existing&new)
  • CI outerloop (tests with external servers)
  • Microbenchmarks (perf+stress)
  • Kestrel tests (on private bits)
  • Autobahn Server tests (on private bits)

(UPD: 2024-08-08) System.Net.WebSockets.Tests.SocketSendReceivePerfTest report

BenchmarkDotNet v0.13.13-nightly.20240311.145, Windows 11 (10.0.22621.3880/22H2/2022Update/SunValley2)
Intel Core i7-9750H CPU 2.60GHz, 1 CPU, 12 logical and 6 physical cores
.NET SDK 9.0.100-preview.7.24362.28
  [Host]     : .NET 9.0.0 (9.0.24.35702), X64 RyuJIT AVX2
  Job-DUWIHO : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX2
  Job-UUOYCU : .NET 9.0.0 (42.42.42.42424), X64 RyuJIT AVX2

PowerPlanMode=00000000-0000-0000-0000-000000000000  IterationTime=250ms  MaxIterationCount=50
MinIterationCount=30  WarmupCount=20

| Method      | Job        | Toolchain         | Mean     | Error    | StdDev   | Median   | Min      | Max      | Ratio | RatioSD | Allocated | Alloc Ratio |
|------------ |----------- |------------------ |---------:|---------:|---------:|---------:|---------:|---------:|------:|--------:|----------:|------------:|
| SendReceive | Job-DUWIHO | \main\corerun.exe | 24.97 μs | 0.627 μs | 1.223 μs | 24.87 μs | 22.82 μs | 28.34 μs |  1.00 |    0.07 |         - |          NA |
| SendReceive | Job-UUOYCU | \pr\corerun.exe   | 24.28 μs | 0.716 μs | 1.413 μs | 24.50 μs | 21.67 μs | 27.65 μs |  0.97 |    0.07 |         - |          NA |
|             |            |                   |          |          |          |          |          |          |       |         |           |             |
| ReceiveSend | Job-DUWIHO | \main\corerun.exe | 66.20 μs | 4.430 μs | 8.949 μs | 68.07 μs | 50.78 μs | 83.30 μs |  1.02 |    0.20 |     668 B |        1.00 |
| ReceiveSend | Job-UUOYCU | \pr\corerun.exe   | 65.27 μs | 3.595 μs | 7.179 μs | 66.05 μs | 49.60 μs | 82.95 μs |  1.00 |    0.18 |     661 B |        0.99 |

Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

1 similar comment
Copy link

Note regarding the new-api-needs-documentation label:

This serves as a reminder for when your PR is modifying a ref *.cs file and adding/modifying public APIs, please make sure the API implementation in the src *.cs file is documented with triple slash comments, so the PR reviewers can sign off that change.

@CarnaViire
Copy link
Member Author

/azp run runtime-libraries-coreclr outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@rzikm rzikm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rzikm
Copy link
Member

rzikm commented Aug 2, 2024

There seem to be a related test crash, but it's on OSX so I can't check which test it is coming from.

Process terminated. Assertion failed.
_keepAlivePingState.AwaitingPong
   at System.Diagnostics.DebugProvider.Fail(String message, String detailMessage) in /_/src/libraries/System.Private.CoreLib/src/System/Diagnostics/DebugProvider.cs:line 22
   at System.Diagnostics.Debug.Fail(String message, String detailMessage) in /_/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Debug.cs:line 134
   at System.Diagnostics.Debug.Assert(Boolean condition, String message, String detailMessage) in /_/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Debug.cs:line 98
   at System.Diagnostics.Debug.Assert(Boolean condition, String message) in /_/src/libraries/System.Private.CoreLib/src/System/Diagnostics/Debug.cs:line 87
   at System.Net.WebSockets.ManagedWebSocket.KeepAlivePingThrowIfTimedOut() in /_/src/libraries/System.Net.WebSockets/src/System/Net/WebSockets/ManagedWebSocket.KeepAlive.cs:line 143
   at System.Net.WebSockets.ManagedWebSocket.KeepAlivePingHeartBeat() in /_/src/libraries/System.Net.WebSockets/src/System/Net/WebSockets/ManagedWebSocket.KeepAlive.cs:line 112
   at System.Net.WebSockets.ManagedWebSocket.HeartBeat() in /_/src/libraries/System.Net.WebSockets/src/System/Net/WebSockets/ManagedWebSocket.KeepAlive.cs:line 72
   at System.Net.WebSockets.ManagedWebSocket.<>c.<.ctor>b__34_0(Object s) in /_/src/libraries/System.Net.WebSockets/src/System/Net/WebSockets/ManagedWebSocket.cs:line 203
   at System.Threading.TimerQueueTimer.<>c.<.cctor>b__30_0(Object state) in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs:line 742
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) in /_/src/libraries/System.Private.CoreLib/src/System/Threading/ExecutionContext.cs:line 179
   at System.Threading.TimerQueueTimer.CallCallback(Boolean isThreadPool) in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs:line 733
   at System.Threading.TimerQueueTimer.Fire(Boolean isThreadPool) in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs:line 686
   at System.Threading.TimerQueue.FireNextTimers() in /_/src/libraries/System.Private.CoreLib/src/System/Threading/Timer.cs:line 334
   at System.Threading.TimerQueue.System.Threading.IThreadPoolWorkItem.Execute() in /_/src/libraries/System.Private.CoreLib/src/System/Threading/TimerQueue.Portable.cs:line 140
   at System.Threading.ThreadPoolWorkQueue.Dispatch() in /_/src/libraries/System.Private.CoreLib/src/System/Threading/ThreadPoolWorkQueue.cs:line 1099
   at System.Threading.PortableThreadPool.WorkerThread.WorkerThreadStart() in /_/src/libraries/System.Private.CoreLib/src/System/Threading/PortableThreadPool.WorkerThread.cs:line 128
   at System.Threading.Thread.StartCallback() in /_/src/mono/System.Private.CoreLib/src/System/Threading/Thread.Mono.cs:line 218

Copy link
Member

@MihaZupan MihaZupan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The read-ahead capability will be added separately.

Do we need it?
Presumably most WebSocket users will have pending reads at all times already (I'm guessing we're relying on that with the current approach).

@rzikm rzikm self-assigned this Aug 2, 2024
@CarnaViire
Copy link
Member Author

/azp run runtime-libraries-coreclr outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@ManickaP ManickaP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general LGTM, most of the comments are "stylistical". The only one functional is the exception type depending on timing.

I also noticed, there's a lot of minor product code changes just for the sake of logging. I assume your perf measurements are done with these changes (with logging off obviously) and they do not affect it.

@CarnaViire
Copy link
Member Author

/azp run runtime-libraries-coreclr outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@CarnaViire
Copy link
Member Author

/azp run runtime-libraries-coreclr outerloop

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@karelz
Copy link
Member

karelz commented Aug 9, 2024

This is important feature for ASP.NET, we should get it in ASAP today (Fri 8/9). I approve.
@CarnaViire please merge it at your earliest convenience.

@stephentoub given it is larger change, can you please review it as well post merge? If there is anything critical, let us know. We can address non-critical feedback in follow up PR.

@karelz karelz added this to the 9.0.0 milestone Aug 9, 2024
@CarnaViire
Copy link
Member Author

All CI failures are unrelated known issues (build analysis green)

@CarnaViire CarnaViire merged commit dce7a21 into dotnet:main Aug 9, 2024
87 of 100 checks passed
@CarnaViire
Copy link
Member Author

Feedback follow-up in #107662

@github-actions github-actions bot locked and limited conversation to collaborators Oct 12, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add WebSocketOptions.Timeout to ClientWebSocket
7 participants