<iWGdndu1WIJLe3T4nZ2dnZfqnPcAAAAA@giganews.com>


Path: Xl.tags.giganews.com!local-2.nntp.ord.giganews.com!news.giganews.com.POSTED!not-for-mail
NNTP-Posting-Date: Thu, 07 Mar 2024 16:09:58 +0000
Subject: Re: Meta: a usenet server just for sci.math
Newsgroups: sci.math
References: <8f7c0783-39dd-4f48-99bf-f1cf53b17dd9@googlegroups.com>
 <a34856ad-4487-42d0-8b3a-397eec3a46dc@googlegroups.com>
 <1b50e6d3-2e7c-41eb-9324-e91925024f90o@googlegroups.com>
 <31663ae2-a6a2-44b8-9aa3-9f0d16d24d79o@googlegroups.com>
 <6eedc16b-2c82-4aaf-a338-92aba2360ba2n@googlegroups.com>
 <51605ff6-f18f-48c5-8e83-0397632556aen@googlegroups.com>
 <b0c4589a-f222-457e-95b3-437c0721c2a2n@googlegroups.com>
 <5a48e832-3573-4c33-b9cb-d112f01b733bn@googlegroups.com>
 <8wWdnVqZk54j3Fj4nZ2dnZfqnPGdnZ2d@giganews.com>
 <MY-cnRuWkPoIhFr4nZ2dnZfqnPSdnZ2d@giganews.com>
 <NqqdnbEz-KTJTlr4nZ2dnZfqnPudnZ2d@giganews.com>
 <FqOcnYWdRfEI2lT4nZ2dnZfqn_SdnZ2d@giganews.com>
 <NVudnVAqkJ0Sk1D4nZ2dnZfqn_idnZ2d@giganews.com>
 <RuKdnfj4NM2rlkz4nZ2dnZfqn_qdnZ2d@giganews.com>
 <HfCdnROSvfir-E_4nZ2dnZfqnPWdnZ2d@giganews.com>
 <FLicnRkOg7SrWU_4nZ2dnZfqnPadnZ2d@giganews.com>
 <v7ecnUsYY7bW40j4nZ2dnZfqnPudnZ2d@giganews.com>
 <q7-dnR2O9OsAAH74nZ2dnZfqnPhg4p2d@giganews.com>
 <QrWdnaIk98Ulgnv4nZ2dnZfqnPVi4p2d@giganews.com>
From: Ross Finlayson <ross.a.finlayson@gmail.com>
Date: Thu, 7 Mar 2024 08:10:01 -0800
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <QrWdnaIk98Ulgnv4nZ2dnZfqnPVi4p2d@giganews.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Message-ID: <iWGdndu1WIJLe3T4nZ2dnZfqnPcAAAAA@giganews.com>
Lines: 609
X-Usenet-Provider: http://www.giganews.com
X-Trace: sv3-xW7OiZQGMMK5TH5Q0NweYvVGtrrgPVbXXfRwA8GJlZoltzLMTYkMcFYRaT3kBnbMjJ/mcYsPVtTR8fY!SSYZVvhRJS6vxB+ZHYKjCBgGkEmPAW08GYC69yy+YIkUO9ZjjOOfFjfmCU5L3x0BoJVJSom23Idw
X-Complaints-To: abuse@giganews.com
X-DMCA-Notifications: http://www.giganews.com/info/dmca.html
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.40
Bytes: 25294

On 03/04/2024 11:23 AM, Ross Finlayson wrote:
>
> So, figuring that BFF then is about designed,
> basically for storing Internet messages with
> regards to MessageId, then about ContentId
> and external resources separately, then here
> the idea again becomes how to make for
> the SFF files, what results, intermediate, tractable,
> derivable, discardable, composable data structures,
> in files of a format with regards to write-once-read-many,
> write-once-read-never, and, "partition it", in terms of
> natural partitions like time intervals and categorical attributes.
>
>
> There are some various great open-source search
> engines, here with respect to something like Lucene
> or SOLR or ElasticSearch.
>
> The idea is that there are attributes searches,
> and full-text searches, those resulting hits,
> to documents apiece, or sections of their content,
> then backward along their attributes, like
> threads and related threads, and authors and
> their cliques, while across groups and periods
> of time.
>
> There's not much of a notion of "semantic search",
> though, it's expected to sort of naturally result,
> here as for usually enough least distance, as for
> "the terms of matching", and predicates from what
> results a filter predicate, here with what I call,
> "Yes/No/Maybe".
>
> Now, what is, "yes/no/maybe", one might ask.
> Well, it's the query specification, of the world
> of results, to filter to the specified results.
> The idea is that there's an accepter network
> for "Yes" and a rejector network for "No"
> and an accepter network for "Maybe" and
> then the rest are rejected.
>
> The idea is that the search, is a combination
> of a bunch of yes/no/maybe terms, or,
> sure/no/yes, to indicate what's definitely
> included, what's not, and what is, then that
> the term, results that it's composable, from
> sorting the terms, to result a filter predicate
> implementation, that can run anywhere along
> the way, from the backend to the frontend,
> this way being a, "search query specification".
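
A minimal sketch of the Yes/No/Maybe terms composed into one filter
predicate, in Python. The precedence is an assumption, not something
the text above pins down: a "yes" hit keeps the document outright, a
"no" hit tosses it, a "maybe" hit keeps it, and the rest are rejected.
The names Term, ORDER and build_filter, and the dict shape of a
document, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Doc = Dict[str, str]            # assumed shape: attribute name -> value
Match = Callable[[Doc], bool]


@dataclass(frozen=True)
class Term:
    kind: str     # "yes", "no", or "maybe"
    name: str     # label, used only to give the sort a stable order
    match: Match  # the term's own matcher


# Assumed precedence: yes (keep outright), then no (toss), then maybe (keep).
ORDER = {"yes": 0, "no": 1, "maybe": 2}


def build_filter(terms: List[Term]) -> Match:
    """Sort the terms, then return a predicate that walks them in order."""
    ordered = sorted(terms, key=lambda t: (ORDER[t.kind], t.name))

    def predicate(doc: Doc) -> bool:
        for term in ordered:
            if term.match(doc):
                # The first matching term decides keep/toss.
                return term.kind != "no"
        return False  # nothing matched: the rest are rejected

    return predicate


keep = build_filter([
    Term("maybe", "group", lambda d: d.get("group") == "sci.math"),
    Term("no", "spam", lambda d: "spam" in d.get("subject", "")),
    Term("yes", "author", lambda d: d.get("author") == "Ross"),
])
```

Because the predicate is rebuilt purely from the sorted term list, the
same list can be shipped to the backend or the frontend and produce
the same filter there.
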
>
>
> There are notions like, "*", and single match
> and multimatch, about basically columns and
> a column model, of documents, that are
> basically rows.
>
>
> The idea of course is to build an arithmetic expression,
> that also is exactly a natural expression,
> for "matches", and "ranges".
>
> "AP"|Archimedes|Plutonium in first|last
>
> Here, there is a search, for various names, that
> it composes this way.
>
> AP first
> AP last
> Archimedes first
> Archimedes last
> Plutonium first
> Plutonium last
>
> As you can see, these "match terms" just naturally
> break out; then what gets into negations
> breaks out and doubles, and what gets into ranges,
> well, that involves partitions and ranges,
> duplicating and breaking those out.
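
The breaking-out of match terms listed above can be sketched as a
cross product, in Python. The function name expand and the parsing
rules (split on " in ", then on "|", strip double quotes) are
assumptions for illustration only; the grammar is not specified here,
and values that themselves contain "|" or " in " are not handled.

```python
from itertools import product
from typing import List, Tuple


def expand(spec: str) -> List[Tuple[str, str]]:
    """Break 'a|b in x|y' out into every (value, attribute) pair."""
    values, _, attributes = spec.partition(" in ")
    return [
        (value.strip().strip('"'), attribute.strip())
        for value, attribute in product(values.split("|"),
                                        attributes.split("|"))
    ]


for value, attribute in expand('"AP"|Archimedes|Plutonium in first|last'):
    print(value, attribute)
```

This prints the same six pairs, in the same order, as the list above.
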
>
> It results though a very fungible and normal form
> of a search query specification, that rebuilds the
> filter predicate according to sorting those, then
> has very well understood runtime according to
> yes/no/maybe and the multimatch, across and
> among multiple attributes, multiple terms.
>
>
> This sort of enriches a usual sort of query,
> "exact full hit", with this sort of "ranges and conditions,
> exact full hits".
>
> So, the Yes/No/Maybe, is the generic search query
> specification, overall, just reflecting an accepter/rejector
> network, with a bit on the front to reflect keep/toss,
> that it's very practical and of course totally commonplace
> and easily written broken out as find or wildmat specs.
>
> For then these the objects and the terms relating
> the things, there's about maintaining this, while
> refining it, that basically there's an ownership
> and a reference count of the filter objects, so
> that various controls according to the syntax of
> the normal form of the expression itself, with
> most usual English terms like "is" and "in" and
> "has" and "between", and "not", with & for "and"
> and | for "or", makes that this should be the kind
> of filter query specification that one would expect
> to be general purpose on all such manners of
> filter query specifications and their controls.
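
A rough sketch of such filter objects in Python, with "is", "in",
"has" and "between" as small constructors and &, | and ~ standing in
for "and", "or" and "not". Everything here (the Pred class, the
function names, documents as plain dicts) is hypothetical; it only
illustrates that the terms compose, and it leaves out the ownership
and reference counting mentioned above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Iterable

Doc = Dict[str, Any]  # assumed shape: attribute name -> value


@dataclass(frozen=True)
class Pred:
    fn: Callable[[Doc], bool]

    def __call__(self, doc: Doc) -> bool:
        return self.fn(doc)

    def __and__(self, other: "Pred") -> "Pred":   # a & b
        return Pred(lambda d: self(d) and other(d))

    def __or__(self, other: "Pred") -> "Pred":    # a | b
        return Pred(lambda d: self(d) or other(d))

    def __invert__(self) -> "Pred":               # ~a, i.e. "not"
        return Pred(lambda d: not self(d))


def is_(attr: str, value: Any) -> Pred:
    return Pred(lambda d: d.get(attr) == value)


def in_(attr: str, values: Iterable[Any]) -> Pred:
    allowed = set(values)
    return Pred(lambda d: d.get(attr) in allowed)


def has(attr: str, text: str) -> Pred:
    return Pred(lambda d: text in str(d.get(attr, "")))


def between(attr: str, low: Any, high: Any) -> Pred:
    # Inclusive range; a missing attribute never matches.
    return Pred(lambda d: attr in d and low <= d[attr] <= high)


query = (is_("group", "sci.math")
         & between("date", "2024-03-01", "2024-03-31")
         & ~has("subject", "spam"))
```

The dates are compared as ISO-8601 strings, which sort the same way
as the dates themselves; that is a convenience of the sketch, not a
claim about how dates are stored.
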
>
> So, a normal form for these filter objects, then
> gets relating them to the SFF files, because, an
> SFF file of a given input corpus satisfies some
> of these specifications, the queries, or for example
> doesn't, about making the language and files
> first of the query, then the content, then just
> mapping those to the content, which are built
> off extractors and summarizers.
>
> I already thought about this a lot. It results
> that it sort of has its own little theory,
> thus what can result its own little normal forms,
> for making a fungible SFF description, what
> results for any query, going through those,
> running the same query or as so filtered down
> the query for the partition already, from the
> front-end to the back-end and back, a little
> noisy protocol, that delivers search results.
>
>
>
>
> The document is an element of the corpus.
> Here each message is a corpus. Now,
> there's a convention in Internet messages,
> not always followed, being that the ignorant
> or lacking etiquette or just plain different,
> don't follow it or break it, there's a convention
> of attribution in Internet messages the
> content that's replied to, and, this is
> variously "block" or "inline".
>
> From the outside though, the document here
> has the "overview" attributes, the key-value
> pairs of the headers those being, and the
> "body" or "document" itself, which can as
> well have extracted attributes, vis-a-vis
> otherwise its, "full text".
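
For illustration, the split into "overview" attributes and "body" can
be sketched with Python's standard email package. The sample message,
its Message-ID, and the function name split_document are made up for
the example; lower-casing the header names is just one possible
convention.

```python
from email import message_from_string
from typing import Dict, Tuple

# Hypothetical sample article; the Message-ID is a placeholder.
raw = (
    "From: Ross Finlayson <ross.a.finlayson@gmail.com>\n"
    "Newsgroups: sci.math\n"
    "Subject: Re: Meta: a usenet server just for sci.math\n"
    "Message-ID: <example@invalid>\n"
    "\n"
    "> quoted line\n"
    "reply line\n"
)


def split_document(text: str) -> Tuple[Dict[str, str], str]:
    """Return (overview, body) for a single-part plain-text message."""
    msg = message_from_string(text)
    # Header names become keys; if a header repeats, the last one wins.
    overview = {name.lower(): value for name, value in msg.items()}
    body = msg.get_payload()  # a str when the message is not multipart
    return overview, body


overview, body = split_document(raw)
print(overview["newsgroups"])
print(body)
```

Attributes extracted from the body itself, such as which lines are
quoted, would be computed from body in a further step.
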
>
> https://en.wikipedia.org/wiki/Search_engine_indexing
>
>
> The key thing here for partitioning is to
> make for date-range partitioning, while,
> the organization of the messages by ID is
> essentially flat, and constant rate to access one
> but linear to trawl through them, although parallelizable,
========== REMAINDER OF ARTICLE TRUNCATED ==========