From: news@zzo38computer.org.invalid
Newsgroups: comp.infosystems,comp.protocols.misc
Subject: Re: Request for comments: Scorpion protocol/file-format
Date: Mon, 08 Apr 2024 16:06:58 -0700
Message-ID: <1712562630.bystand@zzo38computer.org>
References: <1712084972.bystand@zzo38computer.org>

Thank you for your comments. I will try to respond to them as best I
can, and will add whatever is necessary to the FAQ and modify other
parts of the document as appropriate. Some of the changes mentioned
below I have done; others I have partially done or not added yet. I will
continue to work on it later, though.

sean@conman.org wrote:

> First, what is ULFI? All I bring up when I search on that is "Upper Limb
> Functional Index"---I can't seem to locate anything that is close to MIME.
> If you do use TLAs [1] and ETLAs [2], please define it somewhere in the
> document for those who are unfamiliar with it.

Thank you; I will write about it. (In this context, ULFI is short for
"Unordered Labels File Identification".)

> Second, URL support ... do you expect people to follow RFC-3986?
> RFC-3987? Or the WHATWG living specification?

RFC 3986. (However, the "hashed:" scheme has its own rules.)

> Third: On TLS, methinks you underestimate how difficult it is to check
> the first byte of a request is 0x16 and have an existing TLS library take
> over the connection if it is.
> I'm not saying it's impossible, just more
> technically difficult than you may think. Have you implemented a server
> that supports both TLS and non-TLS support on the same port?

I thought you could use recv with the MSG_PEEK flag. (However, I have
not actually tried that yet. If I am wrong, please tell me what is wrong
with it.)

> Third the second: More TLS---those who like TLS might take offence at
> support for non-TLS---an attacker can easily MITM [3] requests to force
> non-TLS requests, thus defeating the purpose of TLS in the first place.

An implementation may allow the user to configure it to not use non-TLS
for some (or all) servers. (This is similar to "HTTPS-Everywhere", but
it is not specific to HTTP(S).) Additionally, the client is supposed to
display a warning message if a redirect from TLS to non-TLS (or
vice-versa) occurs.

I think non-TLS has benefits such as improved simplicity and improved
energy efficiency. However, sometimes encryption is desirable, so TLS is
permitted, too.

> Third the third: There will be a subset of people who hate TLS, and
> demand that you don't use it, but use some other, possibly bespoke,
> encryption system instead. Before taking these people seriously, demand a
> proof-of-concept and an analysis by real cryptographers before you engage
> with them. It'll save time.

I have considered that, and have decided against it (at least for now),
for the reasons you specify, and for reasons mentioned in the Gemini FAQ
(see section 4.5.3). So, for now, it uses TLS.

> Third the fourth: What's with the weird SNI support? The client should
> use it, but the server should not? What?

Maybe it is unclear. What I mean is that the server shouldn't require
SNI, since the host name is included in the request anyway. However, SNI
might be needed for the server to present the proper certificate to the
client; if that is the case, then the server may present an invalid
certificate when the wrong (or no) SNI is used.
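
Returning to the shared-port question above: the following is only a
minimal sketch of the recv/MSG_PEEK idea, written in Python for brevity.
The names looks_like_tls and TLS_HANDSHAKE are made up for illustration
and are not part of any specification. It only classifies the
connection; it has not been tried together with a real TLS library, so
it does not settle how hard the hand-off is in practice.

```python
import socket

# 0x16 is the TLS record content type for a handshake record,
# which is what a ClientHello starts with.
TLS_HANDSHAKE = 0x16


def looks_like_tls(conn: socket.socket) -> bool:
    """Peek at the first byte of the connection without consuming it.

    Blocks until at least one byte is available (or the peer closes).
    """
    first = conn.recv(1, socket.MSG_PEEK)
    return len(first) == 1 and first[0] == TLS_HANDSHAKE


if __name__ == "__main__":
    # Simulate a client and server with a connected socket pair.
    server_side, client_side = socket.socketpair()
    client_side.sendall(b"\x16\x03\x01")  # start of a TLS record header
    print(looks_like_tls(server_side))
    # The peeked byte is still in the buffer, so whatever handles the
    # connection next (TLS library or plain parser) sees the full data.
    print(server_side.recv(3))
    server_side.close()
    client_side.close()
```

Because the byte is only peeked, the socket could afterwards be given
unchanged to either the TLS code or the plain request parser.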
> Third the fifth: What do you mean by "clients SHOULD allow to use the
> system's DNS services to implement encrypted Client Hello"? And what's with
> the following? "if implemented, there MUST be an option to disable this
> feature."

Perhaps my specification is unclear. However, I am not sure how to write
it more clearly.

> Fourth: impose a hard limit on clients following redirects. I know from
> experience that if this isn't mandatory, no one will implement it. Even if
> it is mandatory, some won't implement it, but hopefully it'll be a smaller
> subset who ignore this.

OK. I added it.

> Fifth: Some server implementor will hard code a 2147483647 on a 4x reply,
> which is 69 years. Clients will obviously ignore such a silly request,
> leading to an arms race. Don't bother with a timeout value.

OK, that is a good point. Even in the Gemini protocol, they suggested
removing the time specification in a 4x reply.

> Sixth: For the sub-protocol I, please use BNF for capability codes. And
> what's with terminal emulators?

OK, I will add that; it is a good idea.

> Seventh: The Hashed URI section---what? You first said relative URLs
> aren't allowed in a request, so is this meant for documents? What does the
> hash buy you here? And why number the hash algorithms instead of just
> listing their names? This is getting complicated, quickly.

It is correct that relative URLs aren't allowed in a request, but
hashed: URLs are not necessarily relative (although they can be).
Anyway, a hashed: URL isn't useful in a request. (Some servers might
allow them in proxied requests, if the URL after the comma is absolute,
but this is generally discouraged.) Its use is that links to files can
specify the hash, so that you can verify on the client side that the
file has not changed (and that spies have not tampered with it, if the
source of the hash is trustworthy).

> Eighth: oh, a new document format. Nice. Binary HTML. Even better.
> Big endian---I don't mind, but it's not fasionable among kids today (because
> Intel won; Motorola lost and get over it Boomer!) and will be complained
> about. And by "nice" I mean "oh god!" You'll get people bitching about not
> being able to include control data with their favorite editors and besides,
> you're redefining well defined control codes. You are NOT going to get
> acceptance of this, or the following database file format.

The internet is supposed to be big-endian, isn't it? Although I think
that small-endian is better (independently of which computers use it), I
do not think the difference is significant enough to be worth violating
the convention of the internet in this way. (Also, uxn is big-endian.)

A text-based format would be much more difficult for the client to
parse, since it would have to handle difficult escaping and nesting and
other stuff like that. A binary format will be simpler, especially a
"flat" one such as this one, rather than being nested like HTML and XML.

There are a few possibilities for how to write the document, such as
using a specialized editor, or using a converter or a static site
generator.

> Ninth: ".special/crawl"? Really? Not "/robots.txt"? Or
> "/.wellknown/robots.txt"? Sigh. Even Gemini repurposed "/robots.txt", a
> well known and supported format. But if you insist on a new format, perhaps
> a example (or four) could be included?

I think that there are problems with the robots.txt format, including
possible confusion about what is mandatory and what is optional. I will
add an example, because you are correct that it is a good idea to do so.
(I did not add it yet; sorry. I will do so later.)

> Tenth: What is the purpose of ".special/conversion"? What file formats
> to what file formats?

Any file formats to any file formats.

-- 
Don't laugh at the moon when it is day time in France.