From coff at tuhs.org Tue Oct 7 03:39:10 2025 From: coff at tuhs.org (Douglas McIlroy via COFF) Date: Mon, 6 Oct 2025 13:39:10 -0400 Subject: [COFF] [TUHS] Unix gre, forgotten successor to grep (was: forth on early unix) In-Reply-To: <70D71E86-7484-4BB6-AF0C-2FFC1FC9B710@archibald.dev> References: <96A17F58-C1D8-4CA6-BF2F-EABDE17DF02C@archibald.dev> <70D71E86-7484-4BB6-AF0C-2FFC1FC9B710@archibald.dev> Message-ID: Since QED predated Unix, I'm redirecting this to COFF. Ken's CACM article evoked an unusually harsh response in Computing Reviews. The reviewer said roughly that everybody knows one can make a deterministic recognizer that runs in linear time, so why waste our time talking about an NFA recognizer? This moved me to write a letter to the editor of CR in Ken's defense. I pointed out that a deterministic recognizer can have exponentially more states than the number of symbols in the regular expression. This might well overflow memories of the time and in any event would take exponential time to construct and completely wipe out the advantage of linear recognition time. (Al Aho had not yet invented the egrep algorithm, which constructs only the states encountered during recognition.) Computing Reviews did not have a letters section, so, as far as I know, the off-base review still stands unrebutted in the literature. Doug On Mon, Oct 6, 2025 at 12:04 AM Thalia Archibald wrote: > > Ken, > > Your email reminds me of a comment you made in a 1989 interview with Mike > Mahoney, that suggests something earlier than QED: > > > I did a lot of compiling. Even in college and out of college I did a lot of > > on-the-fly compilers. Ah. ah. I wrote a GREP-like program. It would... You > > type in …, you’d say what you wanted it to look for, and a sed-like thing > > also. That you’d say, I want to do a substitute of A for B or some block of > > text. 
What it would do is compile a program that would look for A and > > substitute in B and then run the compiled program so that one level removed > > from it do I direct my (unclear) and the early languages, the regular > > expression searching stuff in ED and its predecessors on CTSS and those things > > were in fact compilers for searches. They in fact compiled regular... > > https://www.tuhs.org/Archive/Documentation/OralHistory/transcripts/thompson.htm > > By anyone's history of regular expressions, your matcher in QED was the first > software implementation of regular expressions. Was this grep-like program you > wrote in college something earlier than that? Could you share more about it? Do > you somehow still have the source for these? I'd love to study it. > > Thalia > > On Sep 23, 2025, at 11:40, Ken Thompson wrote: > > i think the plan9 grep is the fastest. > > it is grep, egrep, fgrep also. > > i think it is faster than boyer-moore. > > the whole program is a jit dfa > > > > read block > > for c in block > > { > > s=s.state[c] > > if s == nil do something occasionally > > } > > > > it is a very few cycles per byte. all of the > > time is reading a block. i cant imagine b/m > > could be faster. the best b/m could do is > > calculate the skip and then jump over > > bytes that you have already read. > > > > > > russ cox used it to do the (now defunct) code > > search in google. > > From coff at tuhs.org Sat Oct 18 11:44:02 2025 From: coff at tuhs.org (steve jenkin via COFF) Date: Sat, 18 Oct 2025 12:44:02 +1100 Subject: [COFF] [TUHS] To NDEBUG or not to NDEBUG, that is the question In-Reply-To: References: Message-ID: <08014FB9-483A-4ED7-BE5B-BC06D3EA24C6@canb.auug.org.au> This thread, responding to the original, moved to COFF, not about Early Unix. ============================================================ > On 17 Oct 2025, at 22:42, Aharon Robbins via TUHS wrote: > > Now, I can understand why assert() and NDEBUG work the way they do. 
> Particularly on the small PDP-11s on which C and Unix were developed,
> it made sense to have a way to remove assertions from code that would
> be installed for all users.

How many computing workloads are now CPU limited, and can’t afford run-time Sanity Checking in Userland?

For decades, people would try to ‘optimise’ performance by initially writing in assembler [ that myth dealt with by others ]. That appears to have flipped to using huge, slow Frameworks, such as JavaScript / ECMAScript, for ‘Applications’.

I’m not advocating “CPU is free, we can afford to forget about optimisation”. That’s fine for prototypes and ‘run once or twice’ code, where human time matters more, but not in high-volume production workloads. The deliberate creation of bloat & wasting of resources (== energy & dollars) in production work isn’t Professional behaviour IMHO.

10-15 years ago I saw something about Google’s web server CPU utilisation being 60%-70%, from memory. It struck me that “% CPU” wasn’t a good metric for throughput anymore, and ‘system performance’ was a complex, multi-factored problem that had to be tuned per workload and per target metric for ‘performance’. Low latency is only achieved at the cost of throughput; Google may have deliberately opted for lower %CPU to stay responsive.

Around the same time, there were articles about the throughput increase and latency improvement from some large site moving to SSDs. IIRC, their CPU utilisation dropped markedly as well: removing the burden of I/O waits, and the deep scheduling queues they cause, somehow reduced total kernel overhead. Perhaps fewer VM page faults because of shorter process residency…

I’ve no data on modern Supercomputers - I’d expect there to be huge effort in tuning resources for individual applications & data sets. There’d be real incentive at the high end to maximise ‘performance’, as well as at the other end: low-power & embedded systems.
I’m more talking about Commercial Off the Shelf and small- to mid-size installations: - the things people run every day and suffer from slow response times. -- Steve Jenkin, IT Systems and Design 0412 786 915 (+61 412 786 915) PO Box 38, Kippax ACT 2615, AUSTRALIA mailto:sjenkin at canb.auug.org.au http://members.tip.net.au/~sjenkin From coff at tuhs.org Sat Oct 18 14:11:03 2025 From: coff at tuhs.org (Lars Brinkhoff via COFF) Date: Sat, 18 Oct 2025 04:11:03 +0000 Subject: [COFF] [TUHS] To NDEBUG or not to NDEBUG, that is the question In-Reply-To: <08014FB9-483A-4ED7-BE5B-BC06D3EA24C6@canb.auug.org.au> (steve jenkin via COFF's message of "Sat, 18 Oct 2025 12:44:02 +1100") References: <08014FB9-483A-4ED7-BE5B-BC06D3EA24C6@canb.auug.org.au> Message-ID: <7wplak3l48.fsf@junk.nocrew.org> Steve Jenkin wrote: > How many computing workloads are now CPU limited, > and can’t afford run-time Sanity Checking in Userland? At my day job we have compiled with -g -O0 from day one, and we are not eager to change. I suppose if the project management starts to worry about CPU load or memory shortage, then we'll turn on the optimizer. We have joked about adding ballast to the application, so we can score an easy win when someone complains it's too big. From coff at tuhs.org Wed Oct 22 07:44:18 2025 From: coff at tuhs.org (Warren Toomey via COFF) Date: Wed, 22 Oct 2025 07:44:18 +1000 Subject: [COFF] B compiler for Linux/macOS Message-ID: Hi all, I got this e-mail from Serge. I asked and he was happy for me to share the e-mail with you. Cheers, Warren ----- Forwarded message from Serge Vakulenko ----- Dear Warren, I hope this email finds you well. Although we've never met in person, my name is Serge. I'm a software developer based in the San Francisco Bay Area, with a deep passion for computer history. A few years ago, I came across the source code for the B compiler, which was reverse-engineered by Robert Swerczek. 
Intrigued by the challenge of adapting it for contemporary systems, I developed a full-featured B compiler that generates intermediate representation (IR) code for LLVM. This allows it to produce native binaries for Linux or macOS across the x86_64, ARM64, and RISC-V architectures. The compiler itself is implemented in Go (approximately 3,000 lines of code), with a lightweight runtime library in C (under 400 lines). I've kept the API as faithful as possible to the original PDP-7 implementation, enabling direct compilation of files like b.b without modifications.

Here is the project: https://github.com/sergev/blang

Your insightful article on restoring the PDP-7 to run Unix has always inspired me, so I wanted to share this project with you. It's exciting to think that the B language, a foundational piece of computing history, is now accessible to modern developers.

Best regards,
Serge Vakulenko

----- End forwarded message -----

From coff at tuhs.org Mon Oct 27 15:47:07 2025 From: coff at tuhs.org (segaloco via COFF) Date: Mon, 27 Oct 2025 05:47:07 +0000 Subject: [COFF] Some Famicom/NES Utilities in the UNIX Tradition Message-ID:

For those whom this sort of thing may interest, I wanted to take the opportunity to share some tools I've been tinkering with lately, as well as a little background about them. The two main sets are at:

https://gitlab.com/segaloco/misc/-/tree/master/fc_tools

and

https://gitlab.com/segaloco/smb3/-/tree/master/tools

The former are tools general to the Famicom/NES; the latter are tools more specific to Super Mario Bros. 3, my disassembly of which has served as the testbed for developing these and other tools. I share these not only because they fit a number of different needs relevant to both development and reverse engineering of NES games, but also because of the influence the UNIX philosophy has had on the design patterns and decisions I've made.
The bulk of these tools act as filters, specifically so that they can be strung together into pipelines, as is tradition. Furthermore, where a concern matched closely enough with an existing UNIX utility, I used that utility as the interface model for my own. For instance, my ddnes(1) utility derives its argument syntax directly from dd(1) and, similar to how dd(1) abstracts disk blocks and offers some basic conversions like ASCII to EBCDIC, allows for specifying the abstract mapping scheme of the iNES images being dumped from. This sort of replication of familiar UNIX interfaces has significantly lowered the cognitive load not only of remembering flags and options, but also of contemplating the logical structure of pipelined operations. I simply do the same thing I would do for a more generic data operation, except I swap in my tools where necessary.

In some ways I owe it to TUHS and the larger community surrounding these UNIX history efforts that these tools exist at all. The field of video game reverse engineering is what I cut my teeth on as a tech person, and that field has for a long time been dominated by the Windows world: graphical applications, complexity, closed-source solutions, and so on. In other words, being a ROM hacker on weird UNIX platforms is a lonely, relatively DIY situation compared with the same in the Microsoft Windows ecosystem. Learning more about UNIX, and more importantly the UNIX philosophy, through discussions here, historical preservation, and the study of old manuals and source code has thus had an outsize influence on my desire to produce what, in my obviously biased opinion, is a quite comfortable development and reverse engineering environment for the NES on UNIX. In many ways I was inspired by the same motivation UNIX was developed under: to create a simpler, more intuitive technical environment that avoids the needless complexity of many of the more established toolkits and workflows.
Only time will tell if my tools see uptake in the niche communities they concern, but I felt some appreciation was in order for the fact that UNIX impacted my development of this toolkit on so many levels.

- Matt G.

P.S. If you're someone who tinkers with this sort of stuff and has questions or suggestions, I'm always happy to discuss the finer points. Licenses are provided with the usual disclaimers. Know that there isn't much error checking, so don't feed these bad data and redirect output into a precious file without a backup plan. You have been warned: these are not hardened for production workloads.

From coff at tuhs.org Mon Oct 27 21:10:20 2025 From: coff at tuhs.org (Cameron Míčeál Tyre via COFF) Date: Mon, 27 Oct 2025 11:10:20 +0000 Subject: [COFF] Some Famicom/NES Utilities in the UNIX Tradition In-Reply-To: References: Message-ID:

Hi Matt,

Sounds awesome. Wish I'd had tools like that 40+ years ago when I used to reverse engineer Z80 machine code in commercial games to figure out cheats! I used to poke a short routine into unused memory and run it to scan through the game code, searching for likely things such as the lives-remaining counter, the lose-a-life routine, etc. It's great to know that people like you are still doing stuff like that.

Best regards,
Cameron