GPU utilization idea #118
Comments
The buffers are needed so that a component never updates based on another component that a different thread has already updated in the same tick.
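To make that concrete, here is a minimal CPU-side sketch in Rust (not GPU code, and not taken from MCHPRS). It assumes a hypothetical toy model: a chain of inverters in which component `i` reads component `i - 1`, with component 0 acting as a constant source. Updating in place lets one tick ripple through the whole chain, while the double-buffered version only ever sees the previous tick's state.

```rust
/// Hypothetical toy model: component `i` is an inverter whose input is
/// component `i - 1`; component 0 is a constant source and never changes.
fn step_in_place(state: &mut [bool]) {
    // Each write is immediately visible to the next component, so a single
    // "tick" ripples through the entire chain.
    for i in 1..state.len() {
        state[i] = !state[i - 1];
    }
}

/// Double-buffered tick: every component reads only the previous tick's
/// state (`prev`) and writes into a separate buffer (`next`).
/// Assumes `prev` and `next` have the same length.
fn step_double_buffered(prev: &[bool], next: &mut [bool]) {
    if prev.is_empty() {
        return;
    }
    next[0] = prev[0];
    for i in 1..prev.len() {
        next[i] = !prev[i - 1];
    }
}
```

Starting from the same state, the two functions generally produce different results after one call. That difference is exactly what the second buffer is meant to prevent when many threads update components in an unpredictable order.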
we pondered about GPU driven simulation a bunch on the discord before, the problem is that it'd just be a metric f*ckton of work to implement, without knowing that it would be better. there's a few problems:
So basically: unless someone has a bunch of time and really wants to implement it, it won't happen, since it doesn't guarantee higher TPS and is a lot of work to implement.
to clarify on the "random memory accesses" thing: a single torch may be activated (well, turned off really) by two different circuits, and those two circuits could be any others, located anywhere in memory. and you usually have hundreds if not thousands of torches in a build.
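As a rough illustration of that access pattern, here is a small Rust sketch. The `Torch` struct and its `inputs` field are hypothetical names chosen for this example, not MCHPRS types; the sketch assumes the build has been flattened into one array of power states and that each torch stores the indices of its two inputs.

```rust
/// Hypothetical flattened torch: it is lit unless at least one input is powered.
struct Torch {
    /// Indices of the two components that can turn this torch off.
    /// They may point anywhere in the component array.
    inputs: [usize; 2],
}

/// Returns whether the torch is lit, given the power state of every component.
fn torch_output(torch: &Torch, powered: &[bool]) -> bool {
    // Two loads from unrelated positions (a gather). Neighbouring torches
    // generally read from entirely different places in memory.
    !(powered[torch.inputs[0]] || powered[torch.inputs[1]])
}
```

With thousands of torches, each evaluation touches two effectively arbitrary array slots, which is the kind of scattered read that tends to be hard on GPU memory systems.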
Thanks for the feedback! Best of luck with further development on this awesome Minecraft server!
I assume MCHPRS builds a list, every redstone tick (RT), of the components that need to be updated in the next RT.
This list could be sent to a compute shader where 2000 or more components would update at once.
A better solution would probably be to store the components as an array in a buffer in VRAM, with an update flag on every component. There would be two buffers holding the same components, so that no thread reads a component that another thread has already updated.
A thousand threads would look at the list in buffer A at the index of their own ID, update that component if needed, and put it into buffer B. Then that thread would increment where it looks in the list by the total thread count until it gets to the end of the list.
When every thread has finished with the list, buffers A and B are switched and it starts over.
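The loop described above could look roughly like the following Rust sketch, which simulates the GPU threads one after another on the CPU. Everything here is an assumption made for illustration: the `Component` layout, the inverter-style update rule, and the function names are hypothetical, and re-flagging the components that depend on a changed one is left out entirely. `thread_count` is assumed to be at least 1, and both buffers are assumed to have the same length.

```rust
/// Hypothetical component record: `needs_update` is the update flag from the
/// proposal, `input` is the index of the component this one reads.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Component {
    powered: bool,
    needs_update: bool,
    input: usize,
}

/// Work done by one simulated thread: start at its own ID and advance by
/// the total thread count until the end of the list is reached.
fn run_thread(thread_id: usize, thread_count: usize, a: &[Component], b: &mut [Component]) {
    let mut i = thread_id;
    while i < a.len() {
        let mut c = a[i];
        if c.needs_update {
            // Toy rule (an inverter). It reads buffer A only, never buffer B.
            c.powered = !a[c.input].powered;
            c.needs_update = false;
        }
        b[i] = c; // flagged or not, the component is copied into buffer B
        i += thread_count;
    }
}

/// One tick: every thread walks the list, then buffers A and B are swapped.
fn tick(a: &mut Vec<Component>, b: &mut Vec<Component>, thread_count: usize) {
    for thread_id in 0..thread_count {
        run_thread(thread_id, thread_count, &a[..], &mut b[..]);
    }
    std::mem::swap(a, b); // swaps the Vec handles, no element copies
}
```

Because every read goes to buffer A and every write goes to a distinct slot of buffer B, the result of a tick does not depend on how many threads there are or the order in which they run. On a real GPU the outer `for` loop would be replaced by a compute shader dispatch.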
I think this style of processing could slow down creations where only a few components update in one RT, while improving performance for builds such as pipelined CPUs.
I don't know how multiple plots would be processed this way, but a single-plot proof of concept could work with unlimited rtps.