-
-
Notifications
You must be signed in to change notification settings - Fork 98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stable sort has major performance degradation #129
Comments
In the meantime, it should be easy to replicate with a graph that is |
Thank you for filing this issue. I did some research and decided to proceed as follows: I will create a patch release that reverts Then I'm going to take a look at the implementation of I've created a program that will create a disconnected directed graph consisting of simple vertex chains and run package main
import (
"github.com/dominikbraun/graph"
)
const (
numberOfVertices = 12
lengthOfChains = 4
)
func main() {
if numberOfVertices%lengthOfChains != 0 {
panic("numberOfVertices must be divisible by lengthOfChains")
}
numberOfRows := numberOfVertices / lengthOfChains
g := graph.New(graph.IntHash, graph.Directed())
for i := 1; i <= numberOfVertices; i++ {
_ = g.AddVertex(i)
}
vertex := 1
for i := 1; i <= numberOfRows; i++ {
for j := 1; j <= lengthOfChains; j++ {
if j < lengthOfChains {
if err := g.AddEdge(vertex, vertex+1); err != nil {
panic(err)
}
}
vertex++
}
}
_, _ = graph.StableTopologicalSort(g, func(a, b int) bool {
return a < b
})
} The number of vertices and the chain length can be set using Once this is fixed, I'll release another patch release containing the optimized implementation. |
I created a slight variant that has a much larger graph, and includes generating a profile file: package main
import (
"fmt"
"log"
"os"
"runtime/pprof"
"github.com/dominikbraun/graph"
)
const (
numberOfVertices = 500
lengthOfChains = 125
profFile = "/tmp/cpu.prof"
)
func main() {
if numberOfVertices%lengthOfChains != 0 {
panic("numberOfVertices must be divisible by lengthOfChains")
}
numberOfRows := numberOfVertices / lengthOfChains
g := graph.New(graph.IntHash, graph.Directed())
for i := 1; i <= numberOfVertices; i++ {
_ = g.AddVertex(i)
}
vertex := 1
for i := 1; i <= numberOfRows; i++ {
for j := 1; j <= lengthOfChains; j++ {
if j < lengthOfChains {
if err := g.AddEdge(vertex, vertex+1); err != nil {
panic(err)
}
}
vertex++
}
}
f, err := os.Create(profFile)
if err != nil {
log.Fatal(err)
}
defer f.Close()
pprof.StartCPUProfile(f)
defer pprof.StopCPUProfile()
_, _ = graph.StableTopologicalSort(g, func(a, b int) bool {
return a < b
})
fmt.Printf("saved output profile to %s\n", profFile)
} Even with a 500-node graph, it still is giving only 250ms to sort the graph. How strange that mine give orders of magnitude more. And I was careful here to profile only starting before the sort, not before the creation. I wonder if it could have to do with the node type? I will try that next. |
FYI, I tried adding |
I modified it to give different string sizes, and different chain sizes. Here are some interesting outputs.
We can try and get that into a table, but basically:
No idea what is going on. I can try and convert this into a general testing regimen program. here is the basic one I am using: package main
import (
"fmt"
"log"
"math/rand"
"os"
"runtime/pprof"
"time"
"github.com/dominikbraun/graph"
)
const (
numberOfVertices = 500
lengthOfChains = 10
profFile = "/tmp/cpu.prof"
)
var letters = []rune("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
func randSeq(n int) string {
b := make([]rune, n)
for i := range b {
b[i] = letters[rand.Intn(len(letters))]
}
return string(b)
}
func main() {
if numberOfVertices%lengthOfChains != 0 {
panic("numberOfVertices must be divisible by lengthOfChains")
}
stringg10, stringComp10 := stringGraph(10)
stringg20, stringComp20 := stringGraph(20)
intg, intComp := intGraph()
f, err := os.Create(profFile)
if err != nil {
log.Fatal(err)
}
defer f.Close()
pprof.StartCPUProfile(f)
defer pprof.StopCPUProfile()
string10Start := time.Now()
_, _ = graph.StableTopologicalSort(stringg10, stringComp10)
string10Duration := time.Since(string10Start)
string20Start := time.Now()
_, _ = graph.StableTopologicalSort(stringg20, stringComp20)
string20Duration := time.Since(string20Start)
intStart := time.Now()
_, _ = graph.StableTopologicalSort(intg, intComp)
intDuration := time.Since(intStart)
fmt.Printf("number of vertices: %d\n", numberOfVertices)
fmt.Printf("length of chains: %d\n", lengthOfChains)
fmt.Printf("string 10 sort time was %s\n", string10Duration)
fmt.Printf("string 20 sort time was %s\n", string20Duration)
fmt.Printf("int sort time was %s\n", intDuration)
fmt.Printf("saved output profile to %s\n", profFile)
}
func stringGraph(size int) (graph.Graph[string, string], func(a, b string) bool) {
rand.Seed(time.Now().UnixNano())
numberOfRows := numberOfVertices / lengthOfChains
g := graph.New(graph.StringHash, graph.Directed(), graph.Acyclic(), graph.PreventCycles())
var vertices []string
for i := 0; i < numberOfVertices; i++ {
vertex := randSeq(size)
vertices = append(vertices, vertex)
_ = g.AddVertex(vertex)
}
vertex := 0
for i := 0; i < numberOfRows; i++ {
for j := 0; j < lengthOfChains; j++ {
if vertex < len(vertices) - 1 {
if err := g.AddEdge(vertices[vertex], vertices[vertex+1]); err != nil {
panic(err)
}
}
vertex++
}
}
return g, func(a, b string) bool {
return a < b
}
}
func intGraph() (graph.Graph[int, int], func(a, b int) bool) {
numberOfRows := numberOfVertices / lengthOfChains
g := graph.New(graph.IntHash, graph.Directed(), graph.Acyclic(), graph.PreventCycles())
for i := 1; i <= numberOfVertices; i++ {
_ = g.AddVertex(i)
}
vertex := 1
for i := 1; i <= numberOfRows; i++ {
for j := 1; j <= lengthOfChains; j++ {
if j < lengthOfChains {
if err := g.AddEdge(vertex, vertex+1); err != nil {
panic(err)
}
}
vertex++
}
}
return g, func(a, b int) bool {
return a < b
}
} |
FWIW, I tried replacing the Here is part of the updated func TopologicalSort[K comparable, T any](g Graph[K, T]) ([]K, error) {
return StableTopologicalSort(g, nil)
}
func StableTopologicalSort[K comparable, T any](g Graph[K, T], sorter func([]K) sort.Interface) ([]K, error) {
if !g.Traits().IsDirected {
return nil, fmt.Errorf("topological sort cannot be computed on undirected graph")
}
predecessorMap, err := g.PredecessorMap()
if err != nil {
return nil, fmt.Errorf("failed to get predecessor map: %w", err)
}
queue := make([]K, 0)
for vertex, predecessors := range predecessorMap {
if len(predecessors) == 0 {
queue = append(queue, vertex)
}
}
order := make([]K, 0, len(predecessorMap))
visited := make(map[K]struct{})
if sorter != nil {
sort.Stable(sorter(queue))
}
for len(queue) > 0 {
currentVertex := queue[0]
queue = queue[1:]
if _, ok := visited[currentVertex]; ok {
continue
}
order = append(order, currentVertex)
visited[currentVertex] = struct{}{}
for vertex, predecessors := range predecessorMap {
delete(predecessors, currentVertex)
if len(predecessors) == 0 {
queue = append(queue, vertex)
if sorter != nil {
sort.Stable(sorter(queue))
}
}
}
}
// skipping lots of lines
} You can replace In case the updated graphs are helpful, here they are:
Finally, I decided to try the experimental new generic sorting package at https://pkg.go.dev/golang.org/x/exp/slices#SortFunc The signatures for the last one aren't changed at all, just the content of import (
"golang.org/x/exp/slices"
)
// ...
if less != nil {
slices.SortFunc(queue, less)
} instead of These all seem to help. It isn't too bad getting it from 208s to 173-179s, and then down to 158s. For a true performance test, I would need to run each one hundreds of times. I hope this helps. |
graph v0.22.1 reverts the implementation of |
Do we want to pursue any of those |
Congratulations. That's a good reason. Share your LinkedIn profile here for a minute (can delete later), and I'll connect. |
Connection sent. |
The new
ToplogicalSort()
, based on itsStableTopologicalSort()
, has really large performance degradation.A program that used to take 42s, now takes 220-300s (sometimes more), with the vast majority of the time in
StableToplogicalSort()
. The current usage is the OSS wolfictl, but I will try to create a reproduction that is focused just on this.The flame graphs below are from running that with the exact same inputs. The only difference is if I use v0.21.0 or v0.22.0
The first graph is calling
StableToplogicalSort()
with v0.22.0:The second is calling
TopologicalSort()
with v0.22.0 (unchanged, which isn't surprising, as it just wrapsStableTopologicalSort()
:The third is calling
TopologocialSort()
with v0.21.0:In the last, the sort doesn't even take enough time to be bothered into the list.
It might not be the sort itself, as other things might have changed between v0.21.0 and v0.22.0 🤷♂️
The text was updated successfully, but these errors were encountered: