Good post by Andreas Kling on creating Ladybird, a new web browser from scratch.
OpenAI has a Tokenizer web app that encodes text to tokens or counts them. Many people use it to count tokens for ChatGPT, but it only supports the older GPT-3 and Codex models. GPT-3.5 and GPT-4 use a completely different tokenizer, cl100k_base. Its canonical encoder, tiktoken, is implemented in Rust and available for Python as an extension. However, OpenAI offers no web app version of it.
David Duong created a convenient web app called Tiktokenizer which you can use instead.
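If you would rather count tokens locally, here is a minimal sketch that goes through tiktoken directly. The helper names `cl100k_encoder` and `count_tokens` are my own, not part of any library. It assumes tiktoken is installed (`pip install tiktoken`) and that the encoding data can be fetched on first use.

```python
from typing import Callable, Optional, Sequence


def cl100k_encoder() -> Callable[[str], Sequence[int]]:
    """Return the encode function for cl100k_base (hypothetical helper).

    tiktoken is imported lazily so the rest of the module works without it.
    """
    import tiktoken  # third-party package

    return tiktoken.get_encoding("cl100k_base").encode


def count_tokens(
    text: str, encode: Optional[Callable[[str], Sequence]] = None
) -> int:
    """Count the tokens `encode` produces for `text`.

    When no encoder is passed, fall back to cl100k_base via tiktoken.
    """
    if encode is None:
        encode = cl100k_encoder()
    return len(encode(text))
```

Calling `count_tokens("Hello, world!")` uses cl100k_base. Passing another callable, such as `str.split`, swaps the tokenizer without touching the counting logic.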
# Tabloid: The Clickbait Headline Programming Language
Funny! Factorial sample:

    YOU WON'T WANT TO MISS 'Hello, World!'

    DISCOVER HOW TO factorial WITH n
    RUMOR HAS IT
        WHAT IF n IS ACTUALLY 0
            SHOCKING DEVELOPMENT 1
        LIES!
            SHOCKING DEVELOPMENT
                n TIMES factorial OF n MINUS 1
    END OF STORY

    EXPERTS CLAIM result TO BE factorial OF 10
    YOU WON'T WANT TO MISS 'Result is'
    YOU WON'T WANT TO MISS result

    PLEASE LIKE AND SUBSCRIBE

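For comparison, here is roughly how I read the sample in Python. The keyword mapping is my interpretation rather than an official translation: `YOU WON'T WANT TO MISS` prints, `DISCOVER HOW TO ... WITH` defines a function, `WHAT IF ... LIES!` is if/else, `SHOCKING DEVELOPMENT` returns, and `EXPERTS CLAIM ... TO BE` assigns.

```python
def factorial(n: int) -> int:
    # WHAT IF n IS ACTUALLY 0 ... LIES! ...
    if n == 0:
        return 1  # SHOCKING DEVELOPMENT 1
    else:
        return n * factorial(n - 1)  # n TIMES factorial OF n MINUS 1


print("Hello, World!")    # YOU WON'T WANT TO MISS 'Hello, World!'
result = factorial(10)    # EXPERTS CLAIM result TO BE factorial OF 10
print("Result is")
print(result)             # prints 3628800
```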
A very compact representation of an image placeholder. Store it inline with your data and show it while the real image is loading for a smoother loading experience.
# Something Pretty Right: A History of Visual Basic
Great article on the rise and fall of Visual Basic.
Run a fast ChatGPT-like model locally on your device.
This combines the LLaMA foundation model with an open reproduction of Stanford Alpaca, a fine-tuning of the base model to obey instructions (akin to the RLHF used to train ChatGPT), and a set of modifications to llama.cpp to add a chat interface.
# Port of Facebook’s LLaMA model in C/C++
Impressive. It runs fine on my M1 MacBook Air with 8 GB RAM.
Simon Willison wrote a post on how to use it.
# Fastmail: Announcing Squire 2.0
Squire is our rich text editor that powers the mail composer in Fastmail, providing support for formatting and WYSIWYG text editing in all modern browsers.
Now, there are a good number of rich text editors. However, most have the luxury of strongly limiting what the person can enter to ensure the data model doesn’t break. We can’t use those: we must handle arbitrary HTML, because the editor may be used to forward or quote emails from third parties and must preserve their HTML without breaking the formatting. This is Squire’s advantage.
# GPT in 60 Lines of NumPy
In this post, we’ll implement a GPT from scratch in just 60 lines of numpy. We’ll then load the trained GPT-2 model weights released by OpenAI into our implementation and generate some text.
The result is PicoGPT. Very cool. I’m a fan of simple educational implementations.
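To give a flavor of how little code the core of a GPT needs, here is my own minimal sketch of single-head causal self-attention in NumPy. It is not copied from the post or from PicoGPT; the function names and the `-1e10` masking constant are choices made for this illustration.

```python
import numpy as np


def softmax(x: np.ndarray) -> np.ndarray:
    # Subtract the row maximum first for numerical stability.
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / np.sum(e, axis=-1, keepdims=True)


def causal_self_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention where position i only sees positions <= i.

    q, k: arrays of shape [n_seq, d_k]; v: array of shape [n_seq, d_v].
    """
    n_seq, d_k = q.shape
    # Strictly upper-triangular entries get a large negative score,
    # so softmax assigns them (effectively) zero weight.
    mask = (1 - np.tri(n_seq)) * -1e10
    scores = q @ k.T / np.sqrt(d_k) + mask
    return softmax(scores) @ v
```

Because of the mask, the first output row always equals the first row of `v`: the first position has nothing earlier to attend to.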
# After Alaska Airlines planes bump runway, a scramble to ‘pull the plug’
On the morning of Jan. 26, as two Alaska Airlines flights from Seattle to Hawaii lifted off six minutes apart, the pilots each felt a slight bump and the flight attendants at the back of the cabin heard a scraping noise.
As the noses of both Boeing 737s lifted skyward on takeoff, their tails had scraped the runway.
Caused by an update to the software that calculates thrust and speed settings for takeoff:
That morning, a software bug in an update to the DynamicSource tool caused it to provide seriously undervalued weights for the airplanes.
A concurrency bug?
Peyton added that even though the update to the DynamicSource software had been tested over an extended period, the bug was missed because it only presented when many aircraft at the same time were using the system.
Good procedures overall for noticing, stopping all airplanes, and fixing the system.
The dilemma of safety-critical systems: update the system and suffer new bugs, or never update and end up with an old, barely usable system prone to mistakes.