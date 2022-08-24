



Brian Kernighan speaking at a 2012 tribute to Bell Labs colleague and C programming language co-author Dennis Ritchie. Ritchie’s face in the dominoes is behind Kernighan.

A Princeton professor, finding some time for himself during the summer college lull, emailed an old friend a few months ago. Brian Kernighan said hello, asked how their visit to the United States was going, and dropped hundreds of lines of code that could add Unicode support for AWK, the text analysis tool he helped create for Unix at Bell Labs in 1977.

“I’ve tested this extensively, but clearly more testing is needed,” Kernighan wrote in the email, posted in late May as a kind of pseudo-commit on the onetrueawk repo by longtime maintainer Arnold Robbins. “Once I figure out how…I’ll try to submit a pull request. I’d like to understand git better, but despite your help, I still don’t have a good understanding, so this may take a while.”

Kernighan is the “K” in AWK, a specialized language for extracting and manipulating text that was key to Unix’s pipeline functionality and interoperability between systems. A working awk function (AWK is the language, awk the command to invoke it) is essential to both the Single UNIX Specification and IEEE’s POSIX certification for interoperability. There are countless variations of awk, including modern derivations supporting Unicode, but “One True AWK,” sometimes known as nawk, is a kind of canonical version based on Kernighan’s 1988 book, The AWK Programming Language, and his later contributions.

Copies of The C Programming Language, written by Brian Kernighan and Dennis Ritchie (RIP), in their native campus library environment.

Kernighan is also the “K” in “K&R C,” the seminal 1978 book The C Programming Language that he co-authored with Dennis Ritchie and that sticks with programmers, mentally and in dog-eared paper form. Kernighan’s roots in C run much deeper than the book. He had taught C to Bell Labs workers and convinced its creator, Ritchie, to collaborate on a book to spread the knowledge. That book gave birth to the “one true brace style,” the endless debate that goes with it, and the structure that underpins many modern programming languages.

Kernighan also named Unix and first demonstrated the “Hello, world” sample code. He spoke with Richard Jensen of Ars Technica for a 50th anniversary history of Unix.

The onetrueawk repository, where Kernighan appeared in late May, is a relatively quiet place, with 21 contributors, 46 GitHub users watching, and commits coming every few months. As noted by The Register, Kernighan’s Unicode patch came to light mainly because it was mentioned in an interview with the professor by YouTube channel Computerphile.

“It’s always been annoying that AWK only works with ASCII input, or maybe 8-bit, but it doesn’t support Unicode at all,” Kernighan told Professor David Brailsford. “A few months ago I spent some time working with (laughs) an incredibly old program. I have it at this stage where it will actually handle UTF-8 input and output so you can have regular expressions that, you know, match Japanese characters, things like that.”

Kernighan, now 80, casually mentions in the interview that he also put together something “quick and dirty” to let AWK handle CSV files.

