nielsr (HF Staff) committed
Commit 4fada13 · verified · 1 Parent(s): 10f22aa

Update README.md

Files changed (1): README.md (+19 −1)
README.md CHANGED
@@ -24,6 +24,24 @@ pip install -q git+https://github.com/huggingface/transformers.git
 
 Next you can use it like so:
 
+```python
+import torch
+from transformers import pipeline
+
+clip = pipeline(
+    task="zero-shot-image-classification",
+    model="facebook/metaclip-2-worldwide-huge-quickgelu",
+    torch_dtype=torch.bfloat16,
+    device=0
+)
+labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
+
+results = clip("http://images.cocodataset.org/val2017/000000039769.jpg", candidate_labels=labels)
+print(results)
+```
+
+In case you want to perform pre- and postprocessing yourself, you can use the `AutoModel` API:
+
 ```python
 import requests
 import torch
@@ -38,7 +56,7 @@ url = "http://images.cocodataset.org/val2017/000000039769.jpg"
 image = Image.open(requests.get(url, stream=True).raw)
 labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
 
-inputs = processor(text=labels, images=image, return_tensors="pt", truncation=True, padding="max_length", max_length=77)
+inputs = processor(text=labels, images=image, return_tensors="pt", padding=True)
 
 outputs = model(**inputs)
 logits_per_image = outputs.logits_per_image
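The `AutoModel` snippet in the diff stops at `logits_per_image`; the usual final step in CLIP-style zero-shot classification is a softmax over the candidate labels. A minimal sketch of that postprocessing, with made-up logit values standing in for the model's real output:

```python
import torch

# Hypothetical values for outputs.logits_per_image: one image scored
# against three candidate labels (real values come from the model).
logits_per_image = torch.tensor([[24.5, 19.1, 11.2]])
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]

# Softmax over the label dimension turns similarity logits into probabilities
probs = logits_per_image.softmax(dim=-1)
best = probs.argmax(dim=-1).item()

print({label: round(p.item(), 4) for label, p in zip(labels, probs[0])})
print("predicted:", labels[best])
```

Because the softmax is taken across the labels for each image, the probabilities per image sum to 1 and the highest logit wins.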