From: marc nicole <mk1853387@gmail.com>
Newsgroups: comp.lang.python
Subject: How to weight terms based on semantic importance
Date: Wed, 15 Jan 2025 18:40:43 +0100
Message-ID: <mailman.80.1736963341.2912.python-list@python.org>
References: <CAGJtH9TYE-MEqSUHWO-JW5j-d2CtUqet7A_R2fn7A25iScGpFg@mail.gmail.com>

Hello,

I want to weight the terms of a large text based on their semantics rather than on their frequency (as TF-IDF does). Is there a way to do that using NLTK or other means, for example through a vectorizer?

For example: a certain term would weigh more than others, etc.

Thanks
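
To make the question concrete, here is a rough sketch of one thing "semantic weight" could mean: score each term by the cosine similarity between its vector and a chosen topic vector, ignoring how often the term occurs. Everything here is an assumption for illustration: the `TOY_VECTORS` table is hand-made (real vectors would come from some pretrained embedding model), and `semantic_weights` is a hypothetical helper, not an NLTK or vectorizer API.

```python
import math

# Hypothetical, hand-made 3-dimensional "embeddings" for illustration only.
# In practice these would come from a pretrained word-embedding model.
TOY_VECTORS = {
    "engine": (0.9, 0.1, 0.0),
    "wheel": (0.8, 0.2, 0.1),
    "banana": (0.0, 0.1, 0.9),
    "the": (0.3, 0.3, 0.3),
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def semantic_weights(tokens, topic_vector, vectors=TOY_VECTORS):
    """Weight each distinct known token by its similarity to the topic.

    Term frequency plays no role: a token seen once and a token seen
    many times get the same weight. Tokens without a vector are skipped.
    """
    return {t: cosine(vectors[t], topic_vector) for t in set(tokens) if t in vectors}


# Assumed topic: whatever the vector for "engine" represents.
topic = TOY_VECTORS["engine"]
tokens = "the engine and the wheel and the banana".split()
weights = semantic_weights(tokens, topic)

# Print terms from most to least topic-related.
for term, w in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {w:.2f}")
```

With these toy vectors, "the" appears three times but still ranks below "wheel", which is the behaviour asked for: the weight depends on meaning relative to the topic, not on counts.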