This is one of the more amazing papers I've seen
#1 by atrix - opened
What sort of equipment was used to train the hypernetworks on the models from the examples? How long did it take?
It took 5 days on 8 H100 GPUs to train each hypernetwork. You can find the training details in Appendix B of the paper.
Rujikorn changed discussion status to closed