Hugging Face, a popular platform for machine-learning models, has introduced new features and improvements. Users can now fine-tune pre-trained models for visual question answering and object detection. PaliGemma, a new vision-language model, handles captioning, visual question answering, OCR, and more with a single checkpoint. YOLOv10, an object detection model, reports up to 46% lower latency and 25% fewer parameters than YOLOv9 at comparable performance, making it well suited to in-browser use with Transformers.js.
Turns out my Idefics2 notebook works just as well for PaliGemma fine-tuning :) Find it here: https://t.co/izlqvEArCX For JSON use cases, a tiny VLM might be all you need! The stack: @huggingface for the model/PEFT, @LightningAI for training, @weights_biases for logging https://t.co/Ph2PzAwqVI
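The "JSON use cases" idea above is extracting structured records from documents with a small VLM. A minimal sketch of the data-formatting and output-validation side of that pipeline, assuming a hypothetical prompt prefix and invoice schema (neither is taken from the linked notebook):

```python
import json
from typing import Optional


def make_training_pair(question: str, record: dict) -> tuple:
    """Build a (prompt, target) pair for JSON-extraction fine-tuning.

    The "extract JSON" prefix and the record schema are illustrative
    assumptions, not the notebook's actual format.
    """
    prompt = f"extract JSON\n{question}"
    # Compact, key-sorted JSON keeps targets short and deterministic.
    target = json.dumps(record, separators=(",", ":"), sort_keys=True)
    return prompt, target


def parse_model_output(text: str) -> Optional[dict]:
    """Validate that generated text is well-formed JSON; None if not."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None


prompt, target = make_training_pair(
    "What are the invoice fields?",
    {"total": 42.5, "currency": "EUR"},
)
```

Validating every generation with `parse_model_output` gives a cheap automatic metric (fraction of well-formed outputs) alongside field-level accuracy.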
YOLOv10: Real-Time End-to-End Object Detection. According to the paper, their models have up to 46% less latency and 25% fewer parameters than YOLOv9, but yield the same performance. 🤯 This makes them perfect for in-browser usage with 🤗 Transformers.js! Models + code below 👇 https://t.co/RUdQSEulcg
YOLOv10: Real-Time End-to-End Object Detection. Hugging Face Demo (Zero A100): https://t.co/fiQwqGXXAk HF Model Page: https://t.co/1HSIOaypwz Colab Demo: https://t.co/BioiprzA0P Thanks @skalskip92 @roboflow @huggingface ❤️ https://t.co/9Motvm90ZO
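The "end-to-end" in YOLOv10's title refers to dropping the non-maximum suppression (NMS) post-processing step that earlier YOLO versions need, which is part of where the latency savings come from. For context, here is a minimal sketch of classic greedy NMS (standard textbook code, not code from the YOLOv10 repo):

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0


def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS: visit boxes by descending score, drop any box that
    overlaps an already-kept box above iou_thresh. Returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in keep):
            keep.append(i)
    return keep
```

An NMS-free detector emits one box per object directly, so this whole loop (and its tuning knob `iou_thresh`) disappears from the deployment pipeline.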
Here's a notebook that demonstrates how to use PaliGemma, the new vision-language model in KerasNLP. You can use it for captioning, visual question answering, object detection & segmentation, and even OCR -- all with a single model, on the standard Colab GPU… https://t.co/6wioxsoWDr
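PaliGemma switches between those tasks via text prefixes in the prompt rather than separate heads. A small helper sketching the convention as I understand it from the published PaliGemma documentation (verify the exact prefix strings against the model card before relying on them):

```python
def paligemma_prompt(task: str, arg: str = "", lang: str = "en") -> str:
    """Build a PaliGemma task prompt from its task-prefix convention.

    Prefix strings below follow the published PaliGemma conventions as
    I understand them; they are not taken from the linked notebook.
    """
    templates = {
        "caption": f"caption {lang}",          # image captioning
        "vqa": f"answer {lang} {arg}",         # visual question answering
        "detect": f"detect {arg}",             # object detection
        "segment": f"segment {arg}",           # referring segmentation
        "ocr": "ocr",                          # text reading
    }
    return templates[task]
```

One checkpoint, five behaviors: the prefix alone tells the model which output format to produce.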
I have fine-tuned pre-trained PaliGemma on visual question answering using @huggingface transformers 🙋‍♀️💬 A notebook to fine-tune the model and run inference with it, with explanations, is in the next tweet 🤗 https://t.co/CW62lravop
New to @huggingface transformers + quantization? We just refactored the quantization documentation and made it clearer which features each quantization method supports. Any feedback appreciated! https://t.co/wdcl0Nhled https://t.co/i0PTbWRQCA
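A useful first question the quantization docs help answer is how much memory each method saves. The back-of-envelope arithmetic is simple enough to sketch (weight memory only; real quantized models carry some extra overhead for scales/zero-points, and activations are not counted):

```python
def model_memory_gb(n_params: float, bits: int) -> float:
    """Approximate weight-memory footprint in GB (1 GB = 1e9 bytes).

    Ignores quantization overhead (scales, zero-points) and activation
    memory, so treat the result as a lower bound.
    """
    return n_params * bits / 8 / 1e9


fp16_gb = model_memory_gb(7e9, 16)  # 7B params in fp16
int4_gb = model_memory_gb(7e9, 4)   # same model at 4-bit
```

So a 7B-parameter model drops from roughly 14 GB of weights in fp16 to roughly 3.5 GB at 4-bit, which is the difference between needing a data-center GPU and fitting on a consumer card.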