How to use google/pix2struct-docvqa-large with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("visual-question-answering", model="google/pix2struct-docvqa-large")

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("google/pix2struct-docvqa-large")
model = AutoModelForImageTextToText.from_pretrained("google/pix2struct-docvqa-large")
How do I fine-tune it? How can I set the labels and input IDs?
Check this out: https://github.com/NielsRogge/Transformers-Tutorials/blob/master/Pix2Struct/Fine_tune_Pix2Struct_on_key_value_pair_dataset_(PyTorch_Lightning).ipynb
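On the labels / input IDs part of the question, here is a minimal sketch of how one training example might be encoded. It is not taken from the linked notebook. It assumes the DocVQA checkpoint's processor renders the question onto the image as a header, so the encoder receives `flattened_patches` and `attention_mask` rather than `input_ids`, and only the answer is tokenized into labels. The helper names `mask_pad_tokens` and `compute_loss`, the `max_length=32` value, and the choice to mask padding with -100 are assumptions for illustration.

```python
import torch
from transformers import AutoProcessor, Pix2StructForConditionalGeneration

MODEL_ID = "google/pix2struct-docvqa-large"
IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss skips by default


def mask_pad_tokens(label_ids: torch.Tensor, pad_token_id: int) -> torch.Tensor:
    """Return a copy of label_ids with padding positions set to IGNORE_INDEX."""
    labels = label_ids.clone()
    labels[labels == pad_token_id] = IGNORE_INDEX
    return labels


def compute_loss(image, question: str, answer: str) -> torch.Tensor:
    """Hypothetical helper: encode one (image, question, answer) triple and return the loss."""
    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = Pix2StructForConditionalGeneration.from_pretrained(MODEL_ID)

    # Assumption: for VQA checkpoints the processor draws the question on the
    # image, so the encoder inputs are image patches, not token IDs.
    encoding = processor(images=image, text=question, return_tensors="pt")

    # The answer text becomes the decoder target (the labels).
    target = processor.tokenizer(
        answer,
        padding="max_length",
        truncation=True,
        max_length=32,  # assumed value; tune for your answers
        return_tensors="pt",
    )
    labels = mask_pad_tokens(target.input_ids, processor.tokenizer.pad_token_id)

    outputs = model(
        flattened_patches=encoding["flattened_patches"],
        attention_mask=encoding["attention_mask"],
        labels=labels,
    )
    return outputs.loss
```

Calling `compute_loss(PIL.Image.open("page.png"), "What is the invoice date?", "12/03/2021")` with your own image and strings should give a scalar loss you can backpropagate. In a real training loop you would load the processor and model once and batch the examples in a collate function.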
@pathikg Is there any notebook available to fine tune Pix2Struct on DocVQA?