Proposal: Adbc.Cluster to load multiple Go implemented ADBC drivers #73

Closed
cocoa-xu opened this issue May 10, 2024 · 0 comments
cocoa-xu commented May 10, 2024

As reported in apache/arrow-adbc#1841 and golang/go#65050, loading multiple Go-implemented ADBC drivers in the same process can cause issues.

Currently, Go 1.21.8 is used to build some ADBC drivers, and the issue has only been reproduced on x86_64 macOS. However, according to this reply:

Using multiple c-shared libraries in the same process is never really supported. Currently the c-shared library assumed it is the only copy of the Go runtime in the process. Unlike plugins, it doesn't try to see if there is any other Go runtime loaded in the process and integrate with them. Having multiple c-shared libraries in the same process might work in some simple cases, where each shared library mostly works in isolation. But if the program passes pointers around, weird things can happen.

You could try building them into a single c-shared library, or using plugins. Thanks.

which means this issue could also occur on other platforms and architectures.

While I agree that most consumers are probably not going to use multiple drivers at the same time, we can offer an alternative for Elixir users who want (or have) to do that: spawn remote nodes and load each Go-implemented ADBC driver in its own dedicated node, so that every node hosts only one copy of the Go runtime.

Below is some proof-of-concept code (mostly borrowed from phoenixframework/phoenix_pubsub):

adbc_cluster.ex

defmodule Adbc.Cluster do
  def spawn(drivers) do
    # Turn node into a distributed node with the given long name
    :net_kernel.start([:"[email protected]"])

    # Allow spawned nodes to fetch all code from this node
    :erl_boot_server.start([])
    allow_boot(to_charlist("127.0.0.1"))

    drivers
    |> Enum.map(&Task.async(fn -> spawn_node(&1) end))
    |> Enum.map(&Task.await(&1, 30_000))
  end

  defp spawn_node({node_host, driver}) do
    {:ok, node} = :slave.start(to_charlist("127.0.0.1"), node_name(node_host), inet_loader_args())
    add_code_paths(node)
    transfer_configuration(node)
    ensure_applications_started(node)
    start_driver(node, driver)
    {:ok, node}
  end

  def rpc(node, module, function, args) do
    :rpc.block_call(node, module, function, args)
  end

  defp inet_loader_args do
    to_charlist("-loader inet -hosts 127.0.0.1 -setcookie #{:erlang.get_cookie()}")
  end

  defp allow_boot(host) do
    {:ok, ipv4} = :inet.parse_ipv4_address(host)
    :erl_boot_server.add_slave(ipv4)
  end

  defp add_code_paths(node) do
    rpc(node, :code, :add_paths, [:code.get_path()])
  end

  defp transfer_configuration(node) do
    for {app_name, _, _} <- Application.loaded_applications() do
      for {key, val} <- Application.get_all_env(app_name) do
        rpc(node, Application, :put_env, [app_name, key, val])
      end
    end
  end

  defp ensure_applications_started(node) do
    rpc(node, Application, :ensure_all_started, [:mix])
    rpc(node, Mix, :env, [Mix.env()])

    for {app_name, _, _} <- Application.loaded_applications() do
      rpc(node, Application, :ensure_all_started, [app_name])
    end
  end

  defp start_driver(node, driver) do
    args = [
      driver,
      [strategy: :one_for_one]
    ]

    rpc(node, Supervisor, :start_link, args)
  end

  defp node_name(node_host) do
    node_host
    |> to_string()
    |> String.split("@")
    |> Enum.at(0)
    |> String.to_atom()
  end
end
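
As a small illustration of the naming step, the sketch below reproduces the logic of `node_name/1` in a standalone module. `NodeNameSketch`, `short_name/1` and `full_name/2` are hypothetical names introduced here and are not part of the proposal. It assumes that `:slave.start/3` combines the short name with the host, so that the spawned node is named `name@host`.

```elixir
defmodule NodeNameSketch do
  # Hypothetical standalone copy of the proof-of-concept's private node_name/1:
  # keeps only the part before "@" and converts it to an atom.
  def short_name(node_host) do
    node_host
    |> to_string()
    |> String.split("@")
    |> Enum.at(0)
    |> String.to_atom()
  end

  # Assumption: the spawned node ends up named "<short name>@<host>".
  def full_name(node_host, host \\ "127.0.0.1") do
    :"#{short_name(node_host)}@#{host}"
  end
end
```

Because atoms are never garbage collected, `String.to_atom/1` is only reasonable here if the driver names come from a small, trusted set, as they do in the test below.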

test/multiple-go-drivers.exs

defmodule Adbc.Driver.Test do
  use ExUnit.Case

  alias Adbc.Connection

  setup do
    flightsql = [
      {Adbc.Database, driver: :flightsql, process_options: [name: MyApp.FlightSQL]},
      {Adbc.Connection, database: MyApp.FlightSQL, process_options: [name: MyApp.FlightSQLConn]}
    ]

    snowflake = [
      {Adbc.Database, driver: :snowflake, process_options: [name: MyApp.Snowflake]},
      {Adbc.Connection, database: MyApp.Snowflake, process_options: [name: MyApp.SnowflakeConn]}
    ]

    [{:ok, flightsql}, {:ok, snowflake}] =
      Adbc.Cluster.spawn([
        {"flightsql", flightsql},
        {"snowflake", snowflake}
      ])

    %{flightsql: flightsql, snowflake: snowflake}
  end

  test "load multiple drivers", %{flightsql: flightsql, snowflake: snowflake} do
    Adbc.Cluster.rpc(flightsql, Adbc.Connection, :query, [
      MyApp.FlightSQLConn,
      "query that goes to flightsql"
    ])

    Adbc.Cluster.rpc(snowflake, Adbc.Connection, :query, [
      MyApp.SnowflakeConn,
      "some other query that goes to snowflake"
    ])
  end
end
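
The test above discards the results of both calls. `:rpc.block_call/4` returns `{:badrpc, reason}` when the remote call fails (for example, when the node is down), so a caller may want to normalize the result before using it. A minimal sketch, assuming a hypothetical helper module `RpcResultSketch` that is not part of the proposal:

```elixir
defmodule RpcResultSketch do
  # Hypothetical helper: turns the raw return value of :rpc.block_call/4
  # into an {:ok, _} / {:error, _} tuple.
  def normalize({:badrpc, reason}), do: {:error, {:badrpc, reason}}
  def normalize({:ok, value}), do: {:ok, value}
  def normalize({:error, reason}), do: {:error, reason}
  # Any other term is treated as a successful plain return value.
  def normalize(other), do: {:ok, other}

  # Calls a function on `node` and normalizes the outcome.
  def call(node, module, function, args) do
    node
    |> :rpc.block_call(module, function, args)
    |> normalize()
  end
end
```

With a helper along these lines, the test could match on `{:ok, _}` instead of ignoring whatever the remote node returned.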
@josevalim josevalim reopened this May 23, 2024
@josevalim josevalim closed this as not planned May 23, 2024