Hacker News

Latest

Show HN: R3forth, a ColorForth-inspired language with a tiny VM

2026-02-06 @ 22:10:13 · Points: 15 · Comments: 1

Early Christian Writings

2026-02-06 @ 22:00:46 · Points: 61 · Comments: 17

OpenCiv3: Open-source, cross-platform reimagining of Civilization III

2026-02-06 @ 21:51:23 · Points: 160 · Comments: 25

Show HN: Look Ma, No Linux: Shell, App Installer, Vi, Cc on ESP32-S3 / BreezyBox

2026-02-06 @ 21:33:11 · Points: 46 · Comments: 6

https://github.com/valdanylchuk/breezydemo

The underlying ESP-IDF component: https://github.com/valdanylchuk/breezybox

It is something like a Raspberry Pi, but without the overhead of a full server-grade OS.

It captures a lot of the old-school DOS-era coding experience. I created a custom fast text-mode driver and plan to add VGA-like graphics next. ANSI text demos run smoothly, as you can see in the demo video featured in the Readme.

App installs also work smoothly. The first time it installed 6 apps from my git repo with one command, it felt like, "OMG, I got homebrew to run on a toaster!" And best of all, it can install from any repo, with no approvals or waiting: you just publish a compatible ELF file in your release.

Coverage:

Hackaday: https://hackaday.com/2026/02/06/breezybox-a-busybox-like-she...

Hackster.io: https://www.hackster.io/news/valentyn-danylchuk-s-breezybox-...

Reddit: https://www.reddit.com/r/esp32/comments/1qq503c/i_made_an_in...

Monty: A minimal, secure Python interpreter written in Rust for use by AI

2026-02-06 @ 21:16:36 · Points: 18 · Comments: 4

Tell HN: I'm a PM at a big system of record SaaS. We're cooked

2026-02-06 @ 20:43:57 · Points: 81 · Comments: 25

BigCo SoRs, differences aside, have historically been a good, low-drama way to make a living in tech. RSUs, ~40-hour weeks, generally smart colleagues, and real problems to solve for F100 customers. Our products work, but are not loved. Enterprise sales runs the show.

I have no concerns about a scrappy AI startup or indie dev replacing us. The real threat is other SoR vendors, the cloud providers, and of course the AI labs themselves. All of them are coming for our SaaS margins, and as an industry we are woefully unprepared.

Every major SoR has its core competency (HR, ERP, CRM, etc.), but also a long tail of lesser-known portfolio products that increasingly overlap with other SoRs and serve as growth vectors. The competition here is only going to accelerate. As a huge enterprise, you’re not going to rip out a component of your SoR for a cool startup or a vibe-coded internal tool... but you would seriously consider doing so if the alternative comes from another SoR vendor you already use and is cheaper.

The public cloud providers are explicitly positioning themselves as the place where your business data, AI agents/LLMs, and critical applications live. This is on a direct collision course with SoRs’ own AI platform ambitions that they are banking on for growth.

The AI labs themselves have the same ambition. Note where systems of record sit in OpenAI’s Frontier press release marketecture: a dotted, nearly invisible line at the bottom [2].

SoRs aren’t dead, and they’re not being disrupted by vibe coders. But the path forward is brutal.

Which brings me to the hardest point that applies to me as well. SoR teams are not known for fast execution, cutting edge AI adoption, product taste, or engineering excellence. These are exactly the strengths of our new competitors. We also struggle to attract this kind of talent. People who fit that profile go to FAANG or the labs. We could try to compete with RSUs, but those are down ~50% over the past few months, and the industry is under increasing pressure from investors around stock-based comp and M&A in general.

The goal here is an honest take from someone on the inside. There’s a difficult road ahead. I think SoRs will continue to exist in some form, but I don’t think the recent market corrections are overblown.

[1] https://news.ycombinator.com/item?id=46888441 [2] https://openai.com/index/introducing-openai-frontier/

Masked namespace vulnerability in Temporal

2026-02-06 @ 20:04:55 · Points: 24 · Comments: 2

Show HN: I spent 4 years building a UI design tool with only the features I use

2026-02-06 @ 19:27:37 · Points: 182 · Comments: 84

I'm a solo developer who's been doing UI/UX work since 2007. Over the years, I watched design tools evolve from lightweight products into bloated, feature-heavy platforms. I kept finding myself using a small fraction of the features while the rest mostly got in the way.

So a few years ago I set out to build the design tool I wanted. I built Vecti with only what I actually need: pixel-perfect grid snapping, a performant canvas renderer, shared asset libraries, and export/presentation features. No collaborative whiteboarding. No plugin ecosystem. No enterprise features. Just the design loop.

Four years later, I can proudly show it off. Built and hosted in the EU, in line with European privacy regulations. Free tier available (no credit card, one editor forever).

On privacy: I use some basic analytics (page views, referrers) but zero tracking inside the app itself. No session recordings, no behavior analytics, no third-party scripts beyond the essentials.

If you're a solo designer or small team who wants a tool that stays out of your way, I'd genuinely appreciate your feedback: https://vecti.com

Happy to answer questions about the tech stack, architecture decisions, why certain features didn't make the cut, or what's next.

Show HN: If you lose your memory, how to regain access to your computer?

2026-02-06 @ 18:51:58 · Points: 95 · Comments: 87

I combined Shamir secret sharing (HashiCorp Vault's implementation) with age encryption, and packaged it using WASM for a neat in-browser offline UX.

The idea is that if something happens to me, my friends and family would help me regain access to the data that matters most to me. 5 out of 7 friends need to agree for the vault to unlock.

Try out the demo on the website; it runs entirely in your browser!
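As a rough illustration of the threshold scheme described above, here is a toy k-of-n Shamir split/combine over a prime field in Python. This is a sketch only, not HashiCorp Vault's implementation (which works over GF(2^8)), and the secret value is arbitrary:

```python
# Toy sketch of k-of-n Shamir secret sharing over a prime field.
# Illustration only; not the HashiCorp Vault implementation.
import random

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller

def split(secret, n, k):
    """Split `secret` into n shares, any k of which recover it."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(k - 1)]
    def f(x):
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc
    return [(x, f(x)) for x in range(1, n + 1)]

def combine(shares):
    """Recover the secret via Lagrange interpolation at x = 0."""
    secret = 0
    for xi, yi in shares:
        num, den = 1, 1
        for xj, _ in shares:
            if xj != xi:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

shares = split(123456789, n=7, k=5)
assert combine(shares[:5]) == 123456789   # any 5 of the 7 suffice
assert combine(shares[2:]) == 123456789
```

In the real scheme, the shared secret would be the key that decrypts the age-encrypted vault, so holding fewer than five shares reveals nothing about the contents.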

How to effectively write quality code with AI

2026-02-06 @ 18:49:59 · Points: 120 · Comments: 87

Show HN: BioTradingArena – Benchmark for LLMs to predict biotech stock movements

2026-02-06 @ 17:11:52 · Points: 21 · Comments: 9

My friend and I have been experimenting with using LLMs to reason about biotech stocks. Unlike many other sectors, biotech trading is largely event-driven: FDA decisions, clinical trial readouts, safety updates, or changes in trial design can cause a stock to 3x in a single day (https://www.biotradingarena.com/cases/MDGL_2023-12-14_Resmet...).

Interpreting these ‘catalysts,’ which come in the form of press releases, usually requires analysts with prior expertise in biology or medicine. A catalyst that sounds “positive” can still lead to a selloff if, for example:

- the effect size is weaker than expected

- results apply only to a narrow subgroup

- endpoints don’t meaningfully de-risk later phases

- the readout doesn’t materially change approval odds

To explore this, we built BioTradingArena, a benchmark for evaluating how well LLMs can interpret biotech catalysts and predict stock reactions. Given only the catalyst and the information available before the date of the press release (trial design, prior data, PubMed articles, and market expectations), the benchmark tests how accurately the model predicts the stock movement when the catalyst is released.

The benchmark currently includes 317 historical catalysts. We also created subsets for specific indications (with the largest in Oncology) as different indications often have different patterns. We plan to add more catalysts to the public dataset over the next few weeks. The dataset spans companies of different sizes and creates an adjusted score, since large-cap biotech tends to exhibit much lower volatility than small and mid-cap names.

Each row of data includes:

- Real historical biotech catalysts (Phase 1–3 readouts, FDA actions, etc.) and pricing data from the day before and the day of the catalyst

- Linked clinical trial data and PubMed PDFs

Note: there may exist some fairly obvious problems with our approach. First, many clinical trial press releases are likely already included in the LLMs’ pretraining data. While we try to reduce this by de-identifying each press release and providing the LLM only the data available up to the date of the catalyst, there is obviously some uncertainty about whether this is sufficient.

We’ve been using this benchmark to test prompting strategies and model families. Results so far are mixed but interesting: the most reliable approach we found was to use LLMs to quantify qualitative features and then fit a linear regression on those features, rather than predicting prices directly.
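A minimal sketch of that "quantify, then regress" setup might look like the following. All feature names, scores, and returns below are invented for illustration; the point is only that the LLM supplies numeric feature scores and a plain linear model does the price mapping:

```python
# Sketch: an LLM scores qualitative features of each catalyst,
# then a linear regression maps scores to price moves.
# Feature names and all numbers are invented for illustration.
import numpy as np

# Hypothetical LLM-assigned scores per catalyst:
# [effect size vs. expectations, subgroup breadth, change in approval odds]
X = np.array([
    [ 0.8, 0.9,  0.7],   # strong, broad, de-risking readout
    [ 0.2, 0.3, -0.1],   # modest effect, narrow subgroup
    [-0.5, 0.6, -0.4],   # miss on the primary endpoint
    [ 0.6, 0.8,  0.5],
    [-0.2, 0.4, -0.3],
])
y = np.array([0.70, 0.10, -0.31, 0.53, -0.15])  # illustrative next-day returns

# Least-squares fit with an intercept column appended.
A = np.hstack([X, np.ones((len(X), 1))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

def predict(features):
    """Predicted return for a new catalyst's feature scores."""
    return float(np.append(features, 1.0) @ coef)
```

The appeal of this split is that the LLM only does what it is plausibly good at (reading a press release and scoring qualitative factors), while the numeric calibration stays in a simple, inspectable model.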

Just wanted to share this with HN. I built a playground for those of you who would like to experiment with it in a sandbox. Would love to hear your ideas, and I hope people have fun playing with it!

I spent 5 years in DevOps – Solutions engineering gave me what I was missing

2026-02-06 @ 16:45:47 · Points: 65 · Comments: 16

The Waymo World Model

2026-02-06 @ 16:20:42 · Points: 615 · Comments: 392

Sheldon Brown's Bicycle Technical Info

2026-02-06 @ 15:40:42 · Points: 258 · Comments: 61

An Update on Heroku

2026-02-06 @ 15:20:23 · Points: 234 · Comments: 179

Microsoft open-sources LiteBox, a security-focused library OS

2026-02-06 @ 15:13:04 · Points: 265 · Comments: 132

How virtual textures work

2026-02-06 @ 14:32:05 · Points: 13 · Comments: 3

FORTH? Really!?

2026-02-06 @ 13:54:09 · Points: 5 · Comments: 2

Hackers (1995) Animated Experience

2026-02-06 @ 13:49:55 · Points: 322 · Comments: 187

Show HN: Gigacode – Use OpenCode's UI with Claude Code/Codex/Amp

2026-02-06 @ 13:38:18 · Points: 6 · Comments: 4

It's not a fork of OpenCode. Instead, it implements the OpenCode protocol and runs `opencode attach` against a server that converts API calls to the underlying agents.

We built this to scratch our own itch: being able to rapidly switch between coding agents based on the task at hand. For example, we find that:

- Claude Code is the best executor and fast iterator

- Codex (high) is the best for complex or long-running tasks

- OpenCode is best for fine-tuned, do-exactly-as-I-say edits

I personally believe that harnesses matter almost as much as the models in 2026. OpenCode lets you swap out models already, but the CC & Codex harnesses + system prompts make a big difference in practice.

Under the hood, this is all powered by our Sandbox Agent SDK:

- Sandbox Agent SDK provides a universal HTTP API for controlling Claude Code, Codex, and Amp

- Sandbox Agent SDK exposes an OpenCode-compatible endpoint so OpenCode can talk to any agent

- OpenCode connects to Sandbox Agent SDK via attach

I want to emphasize: the Anomaly folks are doing awesome work with OpenCode agent + Zen + Black. I use OC regularly alongside CC & Codex depending on the task. Gigacode is only possible because OpenCode is insanely flexible, hackable, and well documented.

Give it a try:

$ curl -fsSL https://releases.rivet.dev/sandbox-agent/latest/gigacode-ins... | sh

Check out the project, architecture, and other install options:

https://github.com/rivet-dev/sandbox-agent/tree/main/gigacod...

I now assume that all ads on Apple news are scams

2026-02-06 @ 12:16:43 · Points: 889 · Comments: 390

The mystery of the mole playing rough (2019) [video]

2026-02-06 @ 07:12:38 · Points: 7 · Comments: 0

Show HN: Horizons – OSS agent execution engine

2026-02-06 @ 00:58:31 · Points: 11 · Comments: 3

It integrates with our SDK for evaluation and optimization, but also comes batteries-included with self-hosted implementations. We think Horizons will make building agent-based products a lot easier and help builders focus on their proprietary data, context, and algorithms.

Some notes:

- you can configure Claude Code, Codex, or OpenCode to run in the engine, on-demand or on a cron

- we're striving to make it simple to integrate with existing backends via a two-way event-driven interface, but I'm 99.9% sure it'll change as there are a ton of unknown unknowns

- support for MCP, and we are building with authentication (RBAC) in mind, although it's a long journey

- all self-hostable via Docker

A very simplistic way to think about it: an OSS take on Frontier, or maybe OpenClaw for prod.

Show HN: Slack CLI for Agents

2026-02-05 @ 21:38:51 · Points: 14 · Comments: 4

  * Can paste in Slack URLs
  * Token efficient
  * Zero-config (auto auth if you use Slack Desktop)

Auto-downloads files/snippets. Also can read Slack canvases as markdown!

MIT License

Evaluating and mitigating the growing risk of LLM-discovered 0-days

2026-02-05 @ 17:50:37 · Points: 13 · Comments: 2

Show HN: Smooth CLI – Token-efficient browser for AI agents

2026-02-05 @ 16:13:33 · Points: 66 · Comments: 52

Smooth (https://www.smooth.sh) is a browser that agents like Claude Code can use to navigate the web reliably, quickly, and affordably. It lets agents specify tasks in natural language, hiding UI complexity and letting them focus on higher-level intents while carrying out complex web tasks. It can also use your IP address while running browsers in the cloud, which helps a lot with roadblocks like captchas (https://docs.smooth.sh/features/use-my-ip).

Here’s a demo: https://www.youtube.com/watch?v=62jthcU705k Docs start at https://docs.smooth.sh.

Agents like Claude Code are amazing but mostly confined to the CLI, while a ton of valuable work needs a browser. This is a fundamental limitation on what these agents can do.

So far, attempts to add browsers to these agents (Claude’s built-in --chrome, Playwright MCP, agent-browser, etc.) all have interfaces that are unnatural for browsing. They expose hundreds of tools - e.g. click, type, select, etc - and the action space is too complex. (For an example, see the low-level details listed at https://github.com/vercel-labs/agent-browser). Also, they don’t handle the billion edge cases of the internet like iframes nested in iframes nested in shadow-doms and so on. The internet is super messy! Tools that rely on the accessibility tree, in particular, unfortunately do not work for a lot of websites.

We believe that these tools are at the wrong level of abstraction: they make the agent focus on UI details instead of the task to be accomplished.

Using a giant general-purpose model like Opus to click on buttons and fill out forms ends up being slow and expensive. The context window gets bogged down with details like clicks and keystrokes, and the model has to figure out how to do browser navigation each time. A smaller model in a system specifically designed for browsing can actually do this much better and at a fraction of the cost and latency.

Security matters too - probably more than people realize. When you run an agent on the web, you should treat it like an untrusted actor. It should access the web using a sandboxed machine and have minimal permissions by default. Virtual browsers are the perfect environment for that. There’s a good write up by Paul Kinlan that explains this very well (see https://aifoc.us/the-browser-is-the-sandbox and https://news.ycombinator.com/item?id=46762150). Browsers were built to interact with untrusted software safely. They’re an isolation boundary that already works.

Smooth CLI is a browser designed for agents based on what they’re good at. We expose a higher-level interface to let the agent think in terms of goals and tasks, not low-level details.

For example, instead of this:

  click(x=342, y=128)
  type("search query")
  click(x=401, y=130)
  scroll(down=500)
  click(x=220, y=340)
  ...50 more steps
Your agent just says:

  Search for flights from NYC to LA and find the cheapest option
Agents like Claude Code can use the Smooth CLI to extract hard-to-reach data, fill in forms, download files, interact with dynamic content, handle authentication, vibe-test apps, and a lot more.

Smooth enables agents to launch as many browsers and tasks as they want, autonomously, and on-demand. If the agent is carrying out work on someone’s behalf, the agent’s browser presents itself to the web as a device on the user’s network. The need for this feature may diminish over time, but for now it’s a necessary primitive. To support this, Smooth offers a “self” proxy that creates a secure tunnel and routes all browser traffic through your machine’s IP address (https://docs.smooth.sh/features/use-my-ip). This is one of our favorite features because it makes the agent look like it’s running on your machine, while keeping all the benefits of running in the cloud.

We also take away as much security responsibility from the agent as possible. The agent should not be aware of authentication details or be responsible for handling malicious behavior such as prompt injections. While some security responsibility will always remain with the agent, the browser should minimize this burden as much as possible.

We’re biased of course, but in our tests, running Claude with Smooth CLI has been 20x faster and 5x cheaper than Claude Code with the --chrome flag (https://www.smooth.sh/images/comparison.gif). Happy to explain further how we’ve tested this and to answer any questions about it!

Instructions to install: https://docs.smooth.sh/cli. Plans and pricing: https://docs.smooth.sh/pricing.

It’s free to try, and we'd love to get feedback/ideas if you give it a go :)

We’d love to hear what you think, especially if you’ve tried using browsers with AI agents. Happy to answer questions, dig into tradeoffs, or explain any part of the design and implementation!

Claude Composer

2026-02-04 @ 20:59:29 · Points: 67 · Comments: 47

Understanding Neural Network, Visually

2026-02-03 @ 14:49:11 · Points: 199 · Comments: 24

Learning from context is harder than we thought

2026-02-03 @ 13:07:37 · Points: 105 · Comments: 54

The Beauty of Slag

2026-02-03 @ 08:23:23 · Points: 10 · Comments: 2
