in_forward: SIGSEGV error when ACK is enabled due to null connection in send_ack #9443

Closed
mirko-lazarevic opened this issue Sep 30, 2024 · 1 comment · Fixed by #9605

Comments

@mirko-lazarevic
Contributor

Bug Report

Describe the bug
I'm seeing a segmentation fault (SIGSEGV) in the Fluent Bit forward input when ACK is enabled. When ingestion is paused ([ warn] [input] forward.0 paused (mem buf overlimit)), all active connections appear to be closed before send_ack is executed, which leads to a null pointer dereference.

The crash originates in the following code snippet from fw_prot.c:

/* Handle ACK response */
if (chunk_id != -1) {
    chunk = root.via.array.ptr[2].via.map.ptr[chunk_id].val;
    send_ack(ctx->ins, conn, chunk);
}

When send_ack is called, conn->connection is NULL, which causes the crash.

To Reproduce

  • Example log message if applicable:
[2024/09/27 16:40:07] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:07] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:07] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:07] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:07] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:07] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:07] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:07] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:08] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:08] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:08] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:08] [ info] [input:forward:forward.0] ==== sending ack message back ====
[2024/09/27 16:40:08] [ warn] [input] forward.0 paused (mem buf overlimit)
[2024/09/27 16:40:08] [ info] [input] pausing forward.0
[2024/09/27 16:40:08] [ info] [input:forward:forward.0] ==== deleting all connections ====
[2024/09/27 16:40:08] [ info] [input:forward:forward.0] ==== all connections have been deletes ====
[2024/09/27 16:40:08] [ info] [input:forward:forward.0] ==== sending ack message back ====
Process 30395 stopped
* thread #17, name = 'flb-pipeline', stop reason = EXC_BAD_ACCESS (code=1, address=0x190)
    frame #0: 0x00000001000971e0 fluent-bit`flb_connection_get_flags(connection=0x0000000000000000) at flb_connection.c:199:45
   196
   197  int flb_connection_get_flags(struct flb_connection *connection)
   198  {
-> 199      return flb_stream_get_flags(connection->stream);
   200  }
   201
   202  void flb_connection_reset_connection_timeout(struct flb_connection *connection)
Target 0: (fluent-bit) stopped.
  • Steps to reproduce the problem:
  1. Enable ACK by setting a chunk ID on the data sent to the forward input plugin.
  2. Run Fluent Bit in an environment where it can exceed the memory buffer limit (Mem_Buf_Limit).
  3. Wait until the memory buffer overlimit condition triggers and the input is paused.
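For step 1, a client requests an ACK by attaching an option map with a chunk ID as the third element of the forward message, which is the element the fw_prot.c snippet above reads. The following is a minimal sketch of such a payload, hand-encoded as MessagePack. The helper names (`put_fixstr`, `build_forward_msg`), the "log" record key, and the 31-byte string limit are assumptions made for illustration; sending the bytes to 127.0.0.1:24224 is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append a MessagePack fixstr (at most 31 bytes) and return the new offset. */
static size_t put_fixstr(uint8_t *buf, size_t off, const char *s)
{
    size_t len = strlen(s);

    buf[off++] = (uint8_t) (0xa0 | len);
    memcpy(buf + off, s, len);
    return off + len;
}

/*
 * Hypothetical helper: encode [tag, [[ts, {"log": msg}]], {"chunk": chunk}].
 * Returns the number of bytes written, or 0 if any string is longer than
 * 31 bytes or the buffer is too small.
 */
static size_t build_forward_msg(uint8_t *buf, size_t cap, const char *tag,
                                uint32_t ts, const char *msg, const char *chunk)
{
    size_t t = strlen(tag);
    size_t m = strlen(msg);
    size_t c = strlen(chunk);
    size_t off = 0;

    /* 23 bytes of fixed framing plus the three variable strings */
    if (t > 31 || m > 31 || c > 31 || cap < 23 + t + m + c) {
        return 0;
    }

    buf[off++] = 0x93;                  /* array of 3: tag, entries, option */
    off = put_fixstr(buf, off, tag);
    buf[off++] = 0x91;                  /* entries: array of 1 */
    buf[off++] = 0x92;                  /* entry: [time, record] */
    buf[off++] = 0xce;                  /* time as uint32, big-endian */
    buf[off++] = (uint8_t) (ts >> 24);
    buf[off++] = (uint8_t) (ts >> 16);
    buf[off++] = (uint8_t) (ts >> 8);
    buf[off++] = (uint8_t) ts;
    buf[off++] = 0x81;                  /* record: map with 1 pair */
    off = put_fixstr(buf, off, "log");
    off = put_fixstr(buf, off, msg);
    buf[off++] = 0x81;                  /* option: map with 1 pair */
    off = put_fixstr(buf, off, "chunk");
    off = put_fixstr(buf, off, chunk);
    return off;
}
```

Writing many such messages in a tight loop over a TCP socket, against the configuration below, should be one way to push the input past Mem_Buf_Limit while ACKs are still pending.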

Expected behavior
Fluent Bit should handle the ACK response gracefully, even when active connections are closed due to memory buffer overlimit or other conditions.
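One way to picture the expected behavior is a guard that skips the ACK when the connection has already been torn down. The sketch below is not the actual fix and does not use the real Fluent Bit types: `struct demo_connection`, `struct demo_fw_conn`, and `send_ack_checked` are hypothetical stand-ins that only model the null connection seen in the backtrace.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the underlying connection object. */
struct demo_connection {
    int flags;
};

/* Hypothetical stand-in for the per-client state; connection is NULL
 * once the input has been paused and its connections were deleted. */
struct demo_fw_conn {
    struct demo_connection *connection;
};

/* Returns 0 if the ACK would be sent, -1 if it was skipped. */
static int send_ack_checked(struct demo_fw_conn *conn, const char *chunk)
{
    if (conn == NULL || conn->connection == NULL) {
        /* Connection is gone: skip the ACK instead of dereferencing NULL. */
        fprintf(stderr, "skipping ack for chunk '%s': connection closed\n",
                chunk);
        return -1;
    }

    /* A real implementation would write the ACK response here. */
    printf("ack sent for chunk '%s'\n", chunk);
    return 0;
}
```

With such a check in place, the caller can treat a skipped ACK as a non-fatal condition; the client would then presumably retry the chunk after reconnecting.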

Screenshots

Your Environment

  • Version used: latest
  • Configuration:
[SERVICE]
    Log_Level       info
    HTTP_Server     On
    HTTP_Listen     0.0.0.0
    HTTP_Port       8085
    Parsers_File    parsers.conf

[INPUT]
    Name                forward
    Listen              127.0.0.1
    Port                24224
    Buffer_Chunk_Size   512KB
    Buffer_Max_Size     2MB
    Mem_Buf_Limit       10MB

[OUTPUT]
    Name                null
    Match               *

  • Environment name and version (e.g. Kubernetes? What version?):
  • Server type and version:
  • Operating System and version: macOS
  • Filters and plugins:

Additional context

@drbugfinder-work
Contributor

This is related to #9288 (comment)

I will look into this.
