labeling_test

Different tools for automating the labeling process will be tested here.


The code in this repo is mostly taken from here

Some considerations regarding installation

If you are trying to work in an environment where PyTorch is already installed, please first open the settings.ini file and remove torch from the requirements.
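For example, in an nbdev-style settings.ini the dependency line looks roughly like the following (the exact package list shown here is hypothetical):

# before
requirements = torch transformers fastcore

# after removing torch (the environment already provides it)
requirements = transformers fastcore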

Install

git clone git@github.com:HasanGoni/labeling_test.git
cd labeling_test
pip install -e .

In case an SSH key is not added to GitHub, clone this repo using

git clone https://github.com/HasanGoni/labeling_test.git

then

cd labeling_test
pip3 install -e .
## Special consideration regarding `Huggingface`

When a transformers model is downloaded, it is normally stored in the local ~/.cache folder. In case space is scarce in ~/.cache, you should change the download folder. This can be done in different ways; one way is the following.

import os
os.environ['HF_HOME'] = '/home/ai_test/data/huggingface'

Here I am changing the download folder to `/home/ai_test/data/huggingface`. Make sure the directory exists.
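A minimal sketch for making sure that folder exists before pointing HF_HOME at it (the path is just the example from above):

import os

hf_home = '/home/ai_test/data/huggingface'
os.makedirs(hf_home, exist_ok=True)  # create the folder if it is missing
os.environ['HF_HOME'] = hf_home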

Importing libraries

from huggingface_hub import hf_hub_download
from PIL import Image
import cv2
import matplotlib.pyplot as plt
from fastcore.all import *
import torch
import torch.nn.functional as F
from torchvision.transforms.functional import resize, to_pil_image
import numpy as np
from pathlib import Path
from typing import Tuple, List, Union
from transformers import AutoProcessor, SamModel

How to use

Get data

One consideration regarding reading the image and mask: right now you should use PIL for reading the image and OpenCV for reading the mask. I will try to correct this, but for now one should use it this way.

# Downloading from Niels_rogge hugging face repo
# image file
file_name = hf_hub_download(repo_id="nielsr/persam-dog", filename="dog.jpg", repo_type="dataset")
# image mask file
m_file_name = hf_hub_download(repo_id="nielsr/persam-dog", filename="dog_mask.png", repo_type="dataset")
tst_im_path = hf_hub_download(
    repo_id="nielsr/persam-dog", 
    filename="new_dog.jpg", 
    repo_type="dataset")

ref_image = Image.open(file_name).convert('RGB')
ref_mask = cv2.imread(m_file_name)
ref_mask = cv2.cvtColor(ref_mask, cv2.COLOR_BGR2RGB)
tst_img = Image.open(tst_im_path).convert('RGB')
ref_image

tst_img

getting the model

processor = AutoProcessor.from_pretrained("facebook/sam-vit-huge")
# model = PerSamModel.from_pretrained("facebook/sam-vit-huge")
model = SamModel.from_pretrained("facebook/sam-vit-huge")

getting the first prediction mask

# device='cuda'  # in case a gpu is available, otherwise use 'cpu'
device = 'cpu'
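# optional: pick the device automatically instead of hard-coding it
# device = 'cuda' if torch.cuda.is_available() else 'cpu'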
outputs, tst_feat,topk_xy, topk_label, input_sam, best_idx = get_first_prediction(
                                        ref_img=ref_image,
                                        ref_msk=ref_mask,
                                        tst_img=tst_img,
                                        model=model,
                                        processor=processor, 
                                        device=device,
                                        print_=False)
show_(outputs['pred_masks'].to('cpu').numpy()[0][0][0])

outputs_fr,_, _, _,masks_fr = get_first_refined_mask(
    ref_image=ref_image,
    ref_mask=ref_mask,
    tst_img=tst_img,
    processor=processor,
    model=model,
    print_=False,
    device=device,
     )
show_all_masks(masks_fr, outputs=outputs_fr)

  • So we have got 3 masks from the first prediction.
  • For the next step you should tell which one should be used as the prompt input for the next prediction.
    • Normally the best IoU should give you the best mask; in case it doesn't, you should manually decide which mask you want to use.
    • This can be done in the next function, get_last_refined_masks, where the parameter best_idx controls which index is used as the prompt.
    • best_idx = 0 means the first mask
    • best_idx = None means use the mask with the best IoU from above.
  • In the above case you can see that the best IoU = 0.99 is giving me the best mask. Therefore I will be using best_idx=None for the next input prompt (a short IoU check is sketched below).
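A quick way to inspect the predicted IoU scores before choosing best_idx; this is a small sketch that assumes the output returned by get_first_refined_mask exposes an iou_scores tensor like the Transformers SAM output does:

# print the predicted IoU score of each mask
# (assumes `outputs_fr` carries an `iou_scores` tensor)
print(outputs_fr.iou_scores.squeeze().tolist())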
output_f, masks_f = get_last_refined_masks(
    ref_image=ref_image,
    ref_mask=ref_mask,
    processor=processor,
    model=model,
    tst_img=tst_img,
    device=device,
    print_=False,
    outputs=outputs_fr,
    masks=masks_fr,
    best_idx=None
)
show_all_masks(masks_f, outputs=output_f)

So we have our final masks and we can save the last image. Normally, if you used the best mask index previously, then in 99.99% of cases you should get the best mask with the best IoU here.

  • In case it is not, you can again set index=0, 1 or 2 in the save_mask function (a hypothetical call is sketched after the code below).
path = Path(Path.home(), 'Schreibtisch/projects/data/persam')
save_mask(
    masks_f,
    path=Path(path,'mask_test.png'),
    outputs=output_f,
    index=None
)
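In case you want to force a specific mask instead of the best-IoU one, a hypothetical call picking the second mask would look like this:

# hypothetical: save the second mask explicitly instead of the best-IoU one
save_mask(
    masks_f,
    path=Path(path, 'mask_test_idx1.png'),
    outputs=output_f,
    index=1
)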

Grounding DINO Labeling

This notebook is copied from here

First we will see how the model performs on the image

labels=["a dog."]
polygon_refinement=True
threshold=0.3
detector_id="IDEA-Research/grounding-dino-tiny"
segmenter_id="facebook/sam-vit-base"    

file_name = hf_hub_download(
    repo_id="nielsr/persam-dog", 
    filename="dog.jpg", 
    repo_type="dataset")

ref_image = Image.open(file_name).convert('RGB')
# testing together
cv_img, segs = grounding_dino_segmentation(
    image=ref_image,
    labels=["a dog."],
    threshold=threshold,
    polygon_refinement=polygon_refinement,
    detector_id=detector_id,
    segmenter_id=segmenter_id)

annotated_img = get_annotated_img(cv_img, segs)
show_(annotated_img)

get the data

# we will see some other data
image_url = "http://images.cocodataset.org/val2017/000000039769.jpg"
img = load_image(image_url)
img

We need to tell the model what we want to detect in the image.

  • We have cats and remote controls in the image. Let's see whether we can detect them.
labels=["a cat.", "a remote control."]
# refine the mask to get more accurate polygons
polygon_refinement=True
# model threshold , we can adjust this value to get more or less predictions
threshold=0.3
# which model we want to use for object detection
detector_id="IDEA-Research/grounding-dino-tiny"
# which model we want to use for segmentation
segmenter_id="facebook/sam-vit-base"    
# just load the image using PIL
img = load_image(image_url)
# testing together
cv_img, segs = grounding_dino_segmentation(
    image=img,
    labels=["a cat.", "a remote control."],
    threshold=threshold,
    polygon_refinement=polygon_refinement,
    detector_id=detector_id,
    segmenter_id=segmenter_id)

We will receive an image and the results of the segmentation and detection.

len(segs)
segs[0].box, segs[0].mask
(BoundingBox(xmin=344, ymin=23, xmax=637, ymax=373),
 array([[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]], dtype=uint8))
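As an optional aside, if you want to write each detected mask to disk, a minimal sketch that relies only on the .mask arrays shown above (the output file names are hypothetical):

import cv2
import numpy as np

for i, seg in enumerate(segs):
    # binarise to 0/255 so the saved PNG is visible regardless of whether
    # the mask is stored as 0/1 or 0/255
    out = (seg.mask > 0).astype(np.uint8) * 255
    cv2.imwrite(f'mask_{i}.png', out)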

Now we will visualize the results of the segmentation and bounding box detection.

annotated_img = get_annotated_img(cv_img, segs)
show_(annotated_img)

SegGPT labelling

  • SegGPT requires:
    • one image prompt
    • a corresponding mask prompt

It will then create a segmentation mask for similar objects in a test image.

Download SegGPT model

from transformers import SegGptImageProcessor, SegGptForImageSegmentation

checkpoint = "BAAI/seggpt-vit-large"
image_processor = SegGptImageProcessor.from_pretrained(checkpoint)
model = SegGptForImageSegmentation.from_pretrained(
    checkpoint,
    # force_download=True,
    # resume_download=False,
)
ref_image

pil_ref_msk=Image.fromarray(ref_mask)
tst_img

test_image_mask = get_mask(
    image_prompt=ref_image,
    mask_prompt=pil_ref_msk,
    model=model,
    test_img=tst_img,
    show=True
)

show_(test_image_mask)

  • Nice, SegGPT works nicely on our test image.
