From: marc nicole <mk1853387@gmail.com>
Newsgroups: comp.lang.python
Subject: Predicting an object over an pretrained model is not working
Date: Tue, 30 Jul 2024 20:18:42 +0200
Message-ID: <mailman.47.1722364504.2981.python-list@python.org>

Hello all,

I want to give an image as input to a pretrained model and have the
model predict the object's label. I trained the model with TensorFlow on
an annotated database, where the target object class was added to the
pretrained model. The code I am using is the following; I set the target
object image as input and want to get the prediction output:
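The grid-decoding step that process_predicts performs below can be sketched standalone. This is a minimal sketch with NumPy only, using random dummy values in place of real model output; the 7x7 grid, 2 boxes per cell, and single class mirror the parameters used in the code:

```python
import numpy as np

# Assumed parameters, matching the post's YOLO-tiny setup.
cell_size, boxes_per_cell, num_classes = 7, 2, 1

rng = np.random.default_rng(0)
# Dummy stand-ins for the slices taken out of the model's predict tensor.
p_classes = rng.random((cell_size, cell_size, 1, num_classes))  # class probs
C = rng.random((cell_size, cell_size, boxes_per_cell, 1))       # box confidences

# Broadcast multiply gives per-box, per-class confidence,
# shape (cell_size, cell_size, boxes_per_cell, num_classes) = (7, 7, 2, 1).
P = C * p_classes

# The flat argmax over P is mapped back to (row, col, box, class) indices,
# which is exactly what np.unravel_index does in process_predicts.
index = np.unravel_index(np.argmax(P), P.shape)
print(index, P[index])
```

The same argmax/unravel_index pair appears inside the nested loop of process_predicts; this sketch only shows that the index it returns is the 4-tuple position of the highest-confidence box.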

import numpy as np
# YoloTinyNet is assumed to be importable from the poster's own project.

class MultiObjectDetection():

    def __init__(self, classes_name):

        self._classes_name = classes_name
        self._num_classes = len(classes_name)

        self._common_params = {'image_size': 448,
                               'num_classes': self._num_classes,
                               'batch_size': 1}
        self._net_params = {'cell_size': 7, 'boxes_per_cell': 2,
                            'weight_decay': 0.0005}
        self._net = YoloTinyNet(self._common_params, self._net_params,
                                test=True)

    def predict_object(self, image):
        predicts = self._net.inference(image)
        return predicts

    def process_predicts(self, resized_img, predicts, thresh=0.2):
        """
        process the predicts of object detection with one image input.

        Args:
            resized_img: resized source image.
            predicts: output of the model.
            thresh: thresh of bounding box confidence.
        Return:
            predicts_dict: {"stick": [[x1, y1, x2, y2, scores1], [...]]}.
        """
        cls_num = self._num_classes
        bbx_per_cell = self._net_params["boxes_per_cell"]
        cell_size = self._net_params["cell_size"]
        img_size = self._common_params["image_size"]
        p_classes = predicts[0, :, :, 0:cls_num]
        C = predicts[0, :, :, cls_num:cls_num+bbx_per_cell]  # two bounding boxes in one cell.
        coordinate = predicts[0, :, :, cls_num+bbx_per_cell:]  # all bounding box positions.

        p_classes = np.reshape(p_classes, (cell_size, cell_size, 1, cls_num))
        C = np.reshape(C, (cell_size, cell_size, bbx_per_cell, 1))

        # confidence for all classes of all bounding boxes,
        # shape (cell_size, cell_size, boxes_per_cell, num_classes) = (7, 7, 2, 1).
        P = C * p_classes

        predicts_dict = {}
        for i in range(cell_size):
            for j in range(cell_size):
                temp_data = np.zeros_like(P, np.float32)
                temp_data[i, j, :, :] = P[i, j, :, :]
                # flat index of the maximum-confidence box/class in this cell.
                position = np.argmax(temp_data)
                index = np.unravel_index(position, P.shape)

                if P[index] > thresh:
                    class_num = index[-1]
                    coordinate = np.reshape(coordinate,
                                            (cell_size, cell_size, bbx_per_cell, 4))
                    # (cell_size, cell_size, bbox_num_per_cell, coordinate)
                    # where coordinate is [xcenter, ycenter, w, h].
                    max_coordinate = coordinate[index[0], index[1], index[2], :]

                    xcenter = max_coordinate[0]
                    ycenter = max_coordinate[1]
                    w = max_coordinate[2]
                    h = max_coordinate[3]

                    xcenter = (index[1] + xcenter) * (1.0 * img_size / cell_size)
                    ycenter = (index[0] + ycenter) * (1.0 * img_size / cell_size)

                    w = w * img_size
                    h = h * img_size
                    xmin = 0 if (xcenter - w/2.0 < 0) else (xcenter - w/2.0)
                    # note: the original tested (xcenter - w/2.0) here, which
                    # looks like a copy-paste bug; ycenter/h is the intent.
                    ymin = 0 if (ycenter - h/2.0 < 0) else (ycenter - h/2.0)
                    xmax = resized_img.shape[0] if (xmin + w) > resized_img.shape[0] else (xmin + w)
                    ymax = resized_img.shape[1] if (ymin + h) > resized_img.shape[1] else (ymin + h)

                    class_name = self._classes_name[class_num]
                    predicts_dict.setdefault(class_name, [])
                    predicts_dict[class_name].append(
                        [int(xmin), int(ymin), int(xmax), int(ymax), P[index]])

        return predicts_dict

    def non_max_suppress(self, predicts_dict, threshold=0.5):
        """
        implement non-maximum suppression on predicted bounding boxes.
        Args:
            predicts_dict: {"stick": [[x1, y1, x2, y2, scores1], [...]]}.
            threshold: IoU threshold.
        Return:
========== REMAINDER OF ARTICLE TRUNCATED ==========