cascaded norm

#11
by J22 - opened

There is a LayerNorm (post_layernorm) after the last layer of the ViT, and it is immediately followed by an RMSNorm (ln_q in VLPatchMerger).
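To make the setup concrete, here is a minimal pure-Python sketch of the cascade. It is an illustration under assumptions, not the model's actual code: the helper names (`layer_norm`, `rms_norm`), the 4-element hidden vector, the `eps` value, and the `gamma`/`beta` values are all hypothetical. It compares how much the RMSNorm changes the LayerNorm output when the LayerNorm affine parameters are identity versus non-trivial.

```python
import math

def layer_norm(x, gamma=None, beta=None, eps=1e-6):
    """LayerNorm over one feature vector: center, scale to unit variance, then affine."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    gamma = [1.0] * n if gamma is None else gamma
    beta = [0.0] * n if beta is None else beta
    return [g * (v - mean) / math.sqrt(var + eps) + b
            for v, g, b in zip(x, gamma, beta)]

def rms_norm(x, weight=None, eps=1e-6):
    """RMSNorm over one feature vector: divide by the root mean square, no centering."""
    n = len(x)
    mean_sq = sum(v * v for v in x) / n
    weight = [1.0] * n if weight is None else weight
    return [w * v / math.sqrt(mean_sq + eps) for v, w in zip(x, weight)]

def max_abs_diff(a, b):
    return max(abs(u - v) for u, v in zip(a, b))

# Hypothetical hidden state for a single token.
x = [0.5, -1.2, 3.0, 0.7]

# Case 1: LayerNorm with identity affine (assumed), then RMSNorm with unit weight.
plain = layer_norm(x)
diff_plain = max_abs_diff(rms_norm(plain), plain)

# Case 2: LayerNorm with made-up affine parameters, then the same RMSNorm.
gamma = [2.0, 0.5, 1.0, 1.5]
beta = [0.1, -0.3, 0.0, 0.2]
affine = layer_norm(x, gamma, beta)
diff_affine = max_abs_diff(rms_norm(affine), affine)

print("change from RMSNorm after plain LayerNorm: ", diff_plain)
print("change from RMSNorm after affine LayerNorm:", diff_affine)
```

In this sketch the second norm barely changes the output when the LayerNorm affine is identity, since a zero-mean, unit-variance vector already has an RMS of about 1. Once `gamma` and `beta` are non-trivial, the RMSNorm rescales the vector noticeably. Whether that is the intent in the real model is exactly what I am unsure about.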

Is there any special reason for cascading two norms like this?
