Path: ...!fu-berlin.de!uni-berlin.de!not-for-mail
From: Left Right <olegsivokon@gmail.com>
Newsgroups: comp.lang.python
Subject: Re: Help with Streaming and Chunk Processing for Large JSON Data (60
 GB) from Kenna API
Date: Wed, 2 Oct 2024 08:05:02 +0200
Lines: 19
Message-ID: <mailman.27.1727877147.3018.python-list@python.org>
References: <CADrxXXmHUwsQbWqNrwzyKWLyTK0J3Hf0z8hAhGwKYoF2PwK7QA@mail.gmail.com>
 <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org>
 <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>
 <CAJQBtgkLVyNK+vw4u3bFCFEQDH8T3rpyTL+ERyyYHZJskQR6PQ@mail.gmail.com>
 <CAJQBtgnpNkpg-mF2yFCS4P4GYAYsKQ9nEw3Xygja=SE3-=N2Dw@mail.gmail.com>
 <mailman.19.1727796506.3018.python-list@python.org>
 <lm391bFu38hU1@mid.individual.net>
 <CAJQBtgmZehSeBu0y73ALdVq00LHi-R_KKS893FwJkEjkLnsXtA@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
In-Reply-To: <lm391bFu38hU1@mid.individual.net>

> By that definition of "streaming", no parser can ever be streaming,
> because there will be some constructs that must be read in their
> entirety before a suitably-structured piece of output can be
> emitted.

In the same email you replied to, I gave examples of languages for
which parsers can be streaming in general: SCSI or IP. For some
languages (e.g., everything in the context-free family), streaming
parsers are _in general_ impossible, because there are pathological
cases like the one with parsing numbers. But this doesn't mean that
you cannot come up with a streaming parser that is useful only
_sometimes_. And, in practice, languages like XML or JSON do well
with streaming, even though in general it's impossible.
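
To make the "useful only sometimes" point concrete, here is a minimal
sketch in Python of a streaming reader for whitespace-separated
top-level JSON values. The function name iter_json_values and the
whitespace-separated input format are assumptions of mine, not
anything from the Kenna API. It relies only on the standard library's
json.JSONDecoder.raw_decode. Note how a bare number has to be held
back until something after it shows that it has ended, which is, as I
read it, the kind of pathological case mentioned above.

```python
import json


def iter_json_values(chunks):
    """Yield top-level JSON values from an iterable of text chunks.

    Assumes (hypothetically) that values are separated by whitespace.
    """
    decoder = json.JSONDecoder()
    buf = ""
    for chunk in chunks:
        buf += chunk
        while True:
            buf = buf.lstrip()
            if not buf:
                break
            try:
                value, end = decoder.raw_decode(buf)
            except json.JSONDecodeError:
                # Incomplete (or invalid) value: wait for more input.
                break
            is_number = (isinstance(value, (int, float))
                         and not isinstance(value, bool))
            # A number is not self-delimiting: "12" may still become
            # "1234" or "12e3", so emit it only once whitespace follows.
            if is_number and not buf[end:end + 1].isspace():
                break
            yield value
            buf = buf[end:]
    # End of input: whatever is left must be exactly one complete value.
    buf = buf.strip()
    if buf:
        value, end = decoder.raw_decode(buf)
        if buf[end:].strip():
            raise ValueError("trailing data after last JSON value")
        yield value


chunks = ['{"a": 1} 1', '2 [tr', 'ue] 3', '4']
print(list(iter_json_values(chunks)))
```

Objects, arrays and strings come out as soon as their closing
delimiter arrives, but the final number is only emitted at end of
input. The sketch also re-parses the buffer on every chunk, so a
single huge value is buffered whole and costs quadratic time; it
streams well only when the individual values are small.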

I'm sorry if this comes as a surprise.  On the one hand, I don't want
to sound condescending; on the other hand, this is something that
you'd typically study in an automata theory class.  Well, not exactly
in the very same words, but you should be able to figure this stuff
out if you have taken that class.