Path: ...!2.eu.feeder.erje.net!3.eu.feeder.erje.net!feeder.erje.net!fu-berlin.de!uni-berlin.de!not-for-mail
From: Thomas Passin <list1@tompassin.net>
Newsgroups: comp.lang.python
Subject: Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API
Date: Mon, 30 Sep 2024 12:11:46 -0400
Lines: 18
Message-ID: <mailman.7.1727713114.3018.python-list@python.org>
References: <CADrxXXmHUwsQbWqNrwzyKWLyTK0J3Hf0z8hAhGwKYoF2PwK7QA@mail.gmail.com>
 <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org>
 <9dfcd123-c31d-4207-869c-d5466487cba4@tompassin.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <082705B5-7C14-4D33-BF38-73F9CB166293@barrys-emacs.org>

On 9/30/2024 11:30 AM, Barry via Python-list wrote:
> 
> 
>> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list <python-list@python.org> wrote:
>>
>>
>> import polars as pl
>> pl.read_json("file.json")
>>
>>
> 
> This is not going to work unless the computer has a lot more than 60 GiB of RAM.
> 
> As suggested later, a streaming parser is required.

Streaming won't work because the file is gzipped.  You have to receive 
the whole thing before you can unzip it. Once unzipped it will be even 
larger, and all in memory.
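
One caveat worth checking against the claim above: gzip is itself a stream format, so Python's standard library can decompress it piece by piece instead of all at once. The following is a minimal sketch, not a tested solution for the Kenna API. It assumes the download is ordinary single-member gzip; the helper name iter_gunzip and the small in-memory payload standing in for the network response are made up for illustration.

```python
import gzip
import zlib


def iter_gunzip(chunks):
    """Yield decompressed bytes from an iterable of gzip-compressed chunks."""
    # wbits = MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
    for chunk in chunks:
        data = decomp.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered once the input is exhausted.
    tail = decomp.flush()
    if tail:
        yield tail


# Stand-in for a network download (hypothetical data): compress a small
# payload, then feed it back in 16-byte pieces, the way a chunked HTTP
# response might arrive.
payload = b'{"id": 1, "name": "example"}\n' * 1000
compressed = gzip.compress(payload)
pieces = (compressed[i:i + 16] for i in range(0, len(compressed), 16))

total = 0
for block in iter_gunzip(pieces):
    # A real program would hand each block to a streaming JSON parser here
    # instead of just counting bytes.
    total += len(block)

print(total == len(payload))
```

Only one compressed chunk and its decompressed output need to be in memory at a time. A multi-member gzip file would need extra handling (for example via the decompressor's unused_data attribute), and whether the JSON itself can be parsed incrementally depends on its layout.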