
I have a question regarding labeling images for defect detection using semantic segmentation. If the defect is a negative-space type (something that should be present but is missing), how should the label be annotated? My interest is in quantifying the defect area and its relative position for post-processing.

Here are some images to illustrate:

What it should be (ignore the shadow): [image]

A defect on one side: some material is missing, but the background is still blocked by the other side. [image]

A defect on both sides: material is missing and the background can be seen. [image]

How should I label these? I came up with three options:

Label the missing space in reference to a good shape: [image]

Label the edges surrounding the missing space: [image]

Label the edges and the space: [image]

In my opinion, labeling the missing space would give me the best data for post-processing, since it is a direct measurement of the missing area. But I am afraid the model will also label normal parts of the object, or, when the defect is on both sides so the background shows through, label the background as defect area. The labelled pixels are technically "normal" pixels; the defect is really a spatial relation (if you only look at the image inside the label, it looks normal).
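To illustrate why this option makes post-processing direct: if the model outputs a binary "missing space" mask, the area and relative position fall out of simple connected-component analysis. A minimal sketch (the function name `defect_stats` is mine; it assumes scipy is available):

```python
import numpy as np
from scipy import ndimage

def defect_stats(mask: np.ndarray):
    """Return (area_px, centroid_rc) for each connected defect region
    in a binary mask where 1 = predicted missing space."""
    labeled, n_regions = ndimage.label(mask)
    stats = []
    for region in range(1, n_regions + 1):
        ys, xs = np.nonzero(labeled == region)
        area = ys.size                      # defect area in pixels
        centroid = (ys.mean(), xs.mean())   # (row, col) of the defect centre
        stats.append((area, centroid))
    return stats

# Toy mask: a 2x3 defect block in a 6x6 image.
mask = np.zeros((6, 6), dtype=np.uint8)
mask[1:3, 2:5] = 1
stats = defect_stats(mask)  # one region of area 6, centroid (1.5, 3.0)
```

Converting pixel area and centroid to physical units or to a position relative to the part outline would be an extra calibration step.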

Labeling the edges operates on the "positive" space, which usually has a different "edge shadow" than a normal edge, but I would need to extrapolate the area of the defect in post-processing.
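For this option, one way to extrapolate the area, assuming the predicted edge pixels form a closed (or nearly closed) contour, is morphological hole filling; a sketch with scipy (`area_from_edges` is a name I made up, and real predictions would likely need a `binary_closing` first to bridge gaps in the contour):

```python
import numpy as np
from scipy import ndimage

def area_from_edges(edge_mask: np.ndarray) -> np.ndarray:
    """Estimate the enclosed defect interior from a binary edge mask.
    Only works if the edge contour is closed; gaps let the fill leak out."""
    filled = ndimage.binary_fill_holes(edge_mask)  # edges + interior
    return filled & ~edge_mask                     # interior pixels only

# Toy example: a closed square ring of edge pixels around a 3x3 interior.
edges = np.zeros((7, 7), dtype=bool)
edges[1, 1:6] = edges[5, 1:6] = True
edges[1:6, 1] = edges[1:6, 5] = True
interior = area_from_edges(edges)  # 9 interior pixels recovered
```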

Labeling both the edges and the area may carry more information, but the post-processing becomes harder, since neither the edge width (the edges are tapered) nor the orientation (needed to know where the edges are) is fixed.

Any insight on what the best approach is here?

For additional information, I plan to use a UNet-type model or DeepLabV3+, both of which are fully convolutional encoder-decoder networks. The framework I use is PyTorch.
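Whichever labeling scheme is chosen, the training setup is the same per-pixel classification; a minimal PyTorch sketch of the encoder-decoder idea (a toy stand-in with made-up layer sizes, not a real UNet or DeepLabV3+):

```python
import torch
import torch.nn as nn

class TinySegNet(nn.Module):
    """Toy encoder-decoder for binary defect segmentation
    (real UNet / DeepLabV3+ models are much deeper, with skip connections)."""
    def __init__(self, in_ch: int = 3, num_classes: int = 2):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                          # downsample 2x
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 2, stride=2),  # upsample back 2x
            nn.ReLU(),
            nn.Conv2d(16, num_classes, 1),            # per-pixel class logits
        )

    def forward(self, x):
        return self.dec(self.enc(x))

model = TinySegNet()
x = torch.randn(1, 3, 64, 64)          # dummy RGB image
logits = model(x)                       # shape (1, 2, 64, 64)
target = torch.zeros(1, 64, 64, dtype=torch.long)  # per-pixel labels
loss = nn.CrossEntropyLoss()(logits, target)
```

The label masks produced by whichever annotation option you pick become the `target` tensor here, so the choice mainly changes what the per-pixel classes mean, not the training loop.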
