This is one of the more amazing papers I've seen

by atrix - opened 10 days ago

What sort of equipment was used to train the hypernetworks on the models from the examples? How long did it take?

Sakana AI org 7 days ago

Training each hypernetwork took 5 days on 8 H100 GPUs. You can find the training details in Appendix B of the paper.

Rujikorn changed discussion status to closed 7 days ago
