
ovn-trace: need to read latest data when using daemon mode #168

Open

gujun4990 wants to merge 1 commit into base: main
Conversation

@gujun4990 (Contributor) commented Dec 20, 2022

When a logical flow changes in the SB DB, ovn-trace running in daemon mode still evaluates against the previous data, because the database is read only once, at startup. That does not suit daemon mode, so ovn-trace needs to keep reading the latest data from the SB DB while running as a daemon.

In addition, print the unixctl server path so it can be recorded, following the approach of ovn-nbctl and ovn-sbctl.

Signed-off-by: Jun Gu [email protected]
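
For reference, the path printing mentioned above can be done the way ovn-nbctl's daemon mode does it. A minimal sketch, not the actual patch, assuming the unixctl_server_get_path() accessor from OVS's lib/unixctl.h:

/* Sketch: create the unixctl server, then print the socket path it is
 * listening on so callers can record it, as ovn-nbctl and ovn-sbctl do
 * in daemon mode. */
struct unixctl_server *server;
int error = unixctl_server_create(unixctl_path, &server);
if (error) {
    ovs_fatal(error, "failed to create unixctl server");
}
puts(unixctl_server_get_path(server));
fflush(stdout);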

@gujun4990 (Contributor, Author) commented Dec 26, 2022

@dceara I made some optimizations for ovn-trace; please help review this patch. Thanks!
In addition, I think the already_read variable should be removed:

  • In non-daemon mode, read_db needs to run only once after ovn-trace has connected to the SB DB, and the process returns after the trace.
  • In daemon mode, ovn-trace needs the latest data from the SB DB, so read_db must run whenever the SB data changes.

So the already_read variable can be removed, as in the diff below.

-    bool already_read = false;
     for (;;) {
         ovsdb_idl_run(ovnsb_idl);
         unixctl_server_run(server);
@@ -150,10 +156,7 @@ main(int argc, char *argv[])
         }
 
         if (ovsdb_idl_has_ever_connected(ovnsb_idl)) {
-            if (!already_read) {
-                already_read = true;
-                read_db();
-            }
+            read_db();
 
             daemonize_complete();
             if (!get_detach()) {

@dceara (Collaborator) commented on the added read_db(); call:

I didn't review this thoroughly, but looking at what read_db() does, I see we'd be doing this at every iteration:

static void
read_db(void)
{
    read_datapaths();
    read_ports();
    read_mcgroups();
    read_address_sets();
    read_port_groups();
    read_gen_opts();
    read_flows();
    read_mac_bindings();
    read_fdbs();
}

That doesn't seem OK, because we never clean up any of the maps we populate there. E.g., read_datapaths() will reinsert the same datapaths over and over again into the global struct hmap datapaths variable.
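
To make that concrete, read_datapaths() populates the global hmap roughly as follows (a simplified sketch, not the actual ovn-trace source), so calling it once per loop iteration without emptying the map first duplicates every datapath:

/* Simplified sketch of the problem: every call allocates fresh entries
 * and inserts them into the global 'datapaths' hmap, so repeated calls
 * grow the map (and memory usage) without bound. */
static void
read_datapaths(void)
{
    const struct sbrec_datapath_binding *sbdb;
    SBREC_DATAPATH_BINDING_FOR_EACH (sbdb, ovnsb_idl) {
        struct ovntrace_datapath *dp = xzalloc(sizeof *dp);
        dp->sb_uuid = sbdb->header_.uuid;
        hmap_insert(&datapaths, &dp->sb_uuid_node,
                    uuid_hash(&dp->sb_uuid));
    }
}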

@gujun4990 (Contributor, Author) replied:

I can confirm that memory keeps growing with the code above. I will rework the code according to your advice.

@dceara (Collaborator) left a review comment:

We need to incrementally process changes (or clear global data at every iteration) if we want to process DB updates.
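
For illustration, the clear-and-reload variant could gate the re-read on the IDL sequence number, so read_db() only runs when the SB contents actually changed rather than on every iteration. A minimal sketch, not the actual patch, assuming a clear_data() helper like the one posted below:

unsigned int seqno = 0;
for (;;) {
    ovsdb_idl_run(ovnsb_idl);
    unixctl_server_run(server);

    /* ovsdb_idl_get_seqno() changes whenever the IDL commits an update,
     * so this re-reads only on real DB changes. */
    if (ovsdb_idl_has_ever_connected(ovnsb_idl)
        && ovsdb_idl_get_seqno(ovnsb_idl) != seqno) {
        seqno = ovsdb_idl_get_seqno(ovnsb_idl);
        clear_data();   /* frees everything the previous read_db() built */
        read_db();
    }

    unixctl_server_wait(server);
    ovsdb_idl_wait(ovnsb_idl);
    poll_block();
}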

@gujun4990 (Contributor, Author) replied:

We need to incrementally process changes (or clear global data at every iteration) if we want to process DB updates.

I will try to optimize it as you suggested.

@gujun4990 (Contributor, Author) commented Jan 12, 2023

I tried clearing the global data at every iteration; the code is as follows:

static void
ovntrace_fdbs_destroy(struct hmap *fdbs)
{
    struct ovntrace_fdb *fdb;
    HMAP_FOR_EACH_POP (fdb, node, fdbs) {
        free(fdb);
    }
    hmap_destroy(fdbs);
}

static void
ovntrace_mac_bindings_destroy(struct hmap *bindings)
{
    struct ovntrace_mac_binding *binding;
    HMAP_FOR_EACH_POP (binding, node, bindings) {
        free(binding);
    }
    hmap_destroy(bindings);
}

static void
ovntrace_flows_destroy(struct ovntrace_flow **flows, size_t n_flows)
{
    for (size_t i = 0; i < n_flows; i++) {
        struct ovntrace_flow *flow = flows[i];
        free(flow->stage_name);
        free(flow->source);
        expr_destroy(flow->match);
        ovnacts_free(flow->ovnacts, flow->ovnacts_len);
        free(flow);
    }
    free(flows);
}

static void
ovntrace_mcgroups_destroy(struct ovs_list *mcgroups)
{
    struct ovntrace_mcgroup *mcgroup;
    LIST_FOR_EACH_POP (mcgroup, list_node, mcgroups) {
        free(mcgroup->name);
        free(mcgroup->ports);
        free(mcgroup);
    }
}

static void
ovntrace_datapaths_destroy(void)
{
    struct ovntrace_datapath *dp;
    HMAP_FOR_EACH_POP (dp, sb_uuid_node, &datapaths) {
        ovntrace_mcgroups_destroy(&dp->mcgroups);
        ovntrace_mac_bindings_destroy(&dp->mac_bindings);
        ovntrace_fdbs_destroy(&dp->fdbs);
        ovntrace_flows_destroy(dp->flows, dp->n_flows);

        free(dp->name);
        free(dp->name2);
        free(dp->friendly_name);
        free(dp);
    }
    hmap_destroy(&datapaths);
}

static void
ovntrace_ports_destroy(void)
{
    struct shash_node *node, *next;
    SHASH_FOR_EACH_SAFE (node, next, &ports) {
        struct ovntrace_port *port = node->data;
        free(port->name);
        free(port->type);
        free(port->name2);
        free(port->friendly_name);
        free(port);
        shash_delete(&ports, node);
    }
    shash_destroy(&ports);
}

static void
address_sets_destroy(void)
{
    expr_const_sets_destroy(&address_sets);
    shash_destroy(&address_sets);
}

static void
port_groups_destroy(void)
{
    expr_const_sets_destroy(&port_groups);
    shash_destroy(&port_groups);
}

static void
clear_data(void)
{
    /* Skip the very first call: nothing has been populated before the
     * initial read_db(). */
    static bool first_clear = true;
    if (first_clear) {
        first_clear = false;
    } else {
        ovntrace_datapaths_destroy();
        ovntrace_ports_destroy();
        address_sets_destroy();
        port_groups_destroy();
        dhcp_opts_destroy(&dhcp_opts);
        dhcp_opts_destroy(&dhcpv6_opts);
        nd_ra_opts_destroy(&nd_ra_opts);
        controller_event_opts_destroy(&event_opts);
        expr_symtab_destroy(&symtab);
        shash_destroy(&symtab);
    }
}

However, memory still keeps growing when the above code runs in daemon mode, although more slowly than before. Are there any tools that could be used to analyze this? @dceara

@dceara (Collaborator) commented Jan 12, 2023

However, memory still keeps growing when the above code runs in daemon mode, although more slowly than before. Are there any tools that could be used to analyze this? @dceara

We run all our unit tests with AddressSanitizer enabled. You could try that too. You need to build with the sanitizer enabled, e.g.:

ovn/.ci/linux-build.sh, lines 47–48 at commit 52da241:

COMMON_CFLAGS="${COMMON_CFLAGS} -O1 -fno-omit-frame-pointer -fno-common -g"
OVN_CFLAGS="${OVN_CFLAGS} -fsanitize=address,undefined"

Then configure AddressSanitizer to also track memory leaks before starting the ovn-trace daemon, e.g.:

ovn/tests/atlocal.in, lines 214–215 at commit 52da241:

ASAN_OPTIONS=detect_leaks=1:abort_on_error=true:log_path=sanitizers:$ASAN_OPTIONS
export ASAN_OPTIONS

When the ovn-trace daemon exits it will write a sanitizers.<pid> file if there were any memory leaks.

Hope this helps!
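
Putting those steps together, a hypothetical end-to-end run might look like the following; $OVN_SB_DB is a placeholder for the southbound DB address, and the build and ASAN_OPTIONS flags mirror the CI snippets above:

# Build with the same sanitizer flags the CI script uses.
./configure CFLAGS="-O1 -fno-omit-frame-pointer -fno-common -g \
    -fsanitize=address,undefined"
make

# Enable leak tracking, then start ovn-trace in daemon mode.
export ASAN_OPTIONS=detect_leaks=1:abort_on_error=true:log_path=sanitizers
ovn-trace --db="$OVN_SB_DB" --detach --pidfile

# Exercise the daemon with some traces, then stop it; any leaks are
# reported in a sanitizers.<pid> file.
ovs-appctl -t ovn-trace exit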

@gujun4990 (Contributor, Author) commented Jan 14, 2023

I have reworked the ovn-trace code; memory no longer keeps growing in daemon mode. As a next step, we plan to use the I-P (incremental processing) engine to optimize ovn-trace.

@gujun4990 closed this Jan 14, 2023
@gujun4990 reopened this Jan 14, 2023