From: Thomas Passin <list1@tompassin.net>
Newsgroups: comp.lang.python
Subject: Re: Help with Streaming and Chunk Processing for Large JSON Data (60 GB) from Kenna API
Date: Mon, 30 Sep 2024 12:11:46 -0400
Message-ID: <mailman.7.1727713114.3018.python-list@python.org>

On 9/30/2024 11:30 AM, Barry via Python-list wrote:
>
>> On 30 Sep 2024, at 06:52, Abdur-Rahmaan Janhangeer via Python-list <python-list@python.org> wrote:
>>
>> import polars as pl
>> pl.read_json("file.json")
>>
>
> This is not going to work unless the computer has a lot more than 60GiB of RAM.
>
> As later suggested, a streaming parser is required.

Streaming won't work because the file is gzipped. You have to receive the whole thing before you can unzip it. Once unzipped it will be even larger, and all in memory.
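
Whether the whole file has to arrive first depends on the decompressor used. As a minimal sketch, assuming the payload is an ordinary single-member gzip stream, Python's standard zlib module can decompress it piece by piece, so only one chunk needs to be in memory at a time. The function name gunzip_chunks and the sample payload are made up for illustration; nothing here is specific to the Kenna API, and the decompressed bytes would still need an incremental JSON parser downstream.

```python
import gzip
import zlib
from typing import Iterable, Iterator


def gunzip_chunks(chunks: Iterable[bytes]) -> Iterator[bytes]:
    """Yield decompressed bytes for each compressed chunk as it arrives."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        out = decomp.decompress(chunk)
        if out:
            yield out
    # Emit anything still buffered once the input is exhausted.
    tail = decomp.flush()
    if tail:
        yield tail


# Hypothetical stand-in for a network response: a gzip payload
# split into 64-byte pieces, as if read from a socket.
payload = gzip.compress(b'{"id": 1}\n' * 1000)
pieces = [payload[i:i + 64] for i in range(0, len(payload), 64)]

# Consume the stream block by block without joining it in memory.
total = sum(len(block) for block in gunzip_chunks(pieces))
```

In real use the pieces would come from whatever chunked read the HTTP client offers, and each decompressed block would be handed to the parser instead of just being counted.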