Hacker News

Latest

China Moon Mission: Aiming for 2030 Lunar Landing

2026-02-03 @ 19:32:11 · Points: 45 · Comments: 25

AliSQL: Alibaba's open-source MySQL with vector and DuckDB engines

2026-02-03 @ 18:40:18 · Points: 73 · Comments: 6

Y Combinator will let founders receive funds in stablecoins

2026-02-03 @ 18:28:48 · Points: 39 · Comments: 47

Xcode 26.3 unlocks the power of agentic coding

2026-02-03 @ 18:04:08 · Points: 169 · Comments: 114

Sandboxing AI Agents in Linux

2026-02-03 @ 17:35:37 · Points: 32 · Comments: 20

Deno Sandbox

2026-02-03 @ 17:33:20 · Points: 193 · Comments: 69

Migrate Wizard – IMAP Based Email Migration Tool

2026-02-03 @ 17:19:47 · Points: 17 · Comments: 17

Defining Safe Hardware Design [pdf]

2026-02-03 @ 17:12:04 · Points: 28 · Comments: 4

Show HN: Octosphere, a tool to decentralise scientific publishing

2026-02-03 @ 17:11:42 · Points: 27 · Comments: 11

This was inspired by Octopus (https://www.octopus.ac/), so I got a bit excited over the weekend and built Octosphere.

Hopefully some of you find it interesting! Blog post here: https://andreasthinks.me/posts/octosphere/octosphere.html

Show HN: I built "AI Wattpad" to eval LLMs on fiction

2026-02-03 @ 17:08:43 · Points: 15 · Comments: 20

Narrator (https://narrator.sh/llm-leaderboard) is a platform where LLMs generate serialized fiction and get ranked by real reader engagement. The question it tries to answer: which model actually writes the best fiction?

Turns out this is surprisingly hard to answer. Creative writing isn't a single capability – it's a pipeline: brainstorming → writing → memory. You need to generate interesting premises, execute them with good prose, and maintain consistency across a long narrative. Most benchmarks test these in isolation, but readers experience them as a whole.

The current evaluation landscape is fragmented. Memory benchmarks like FictionLive's tests use MCQs to check if models remember plot details across long contexts. Useful, but memory is necessary for good fiction, not sufficient: a model can ace recall and still write boring stories.

Author-side usage data from tools like Novelcrafter shows which models writers prefer as copilots. But that measures what's useful for human-AI collaboration, not what produces engaging standalone output. Authors and readers have different needs.

LLM-as-a-judge is the most common approach for prose quality, but it's notoriously unreliable for creative work. Models have systematic biases (favoring verbose prose, certain structures), and "good writing" is genuinely subjective in ways that "correct code" isn't.

What's missing is a reader-side quantitative benchmark – something that measures whether real humans actually enjoy reading what these models produce. That's the gap Narrator fills: views, time spent reading, ratings, bookmarks, comments, return visits. Think of it as an "AI Wattpad" where the models are the authors.

I shared an early DSPy-based version here 5 months ago (https://news.ycombinator.com/item?id=44903265). The big lesson: one-shot generation doesn't work for long-form fiction. Models lose plot threads, forget characters, and quality degrades across chapters.

The rewrite: from one-shot to a persistent agent loop

The current version runs each model through a writing harness that maintains state across chapters. Before generating, the agent reviews structured context: character sheets, plot outlines, unresolved threads, world-building notes. After generating, it updates these artifacts for the next chapter. Essentially each model gets a "writer's notebook" that persists across the whole story.
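
A minimal sketch of that loop, assuming a generate(prompt) -> str callable for the model; the field names and prompts here are illustrative, not Narrator's actual harness:

  import json
  from dataclasses import asdict, dataclass, field
  from typing import Callable

  @dataclass
  class Notebook:
      characters: dict = field(default_factory=dict)    # name -> character sheet
      outline: list = field(default_factory=list)       # planned plot beats
      open_threads: list = field(default_factory=list)  # unresolved threads
      world_notes: list = field(default_factory=list)   # world-building facts

  def write_chapter(generate: Callable[[str], str], nb: Notebook, n: int) -> str:
      # Before generating: review the structured context.
      context = json.dumps(asdict(nb), indent=2)
      chapter = generate(f"Writer's notebook:\n{context}\n\nWrite chapter {n}.")
      # After generating: update the artifacts for the next chapter
      # (assumes the model returns valid JSON with the same keys).
      updated = generate(
          f"Chapter {n}:\n{chapter}\n\n"
          "Return the updated writer's notebook as JSON, same keys."
      )
      nb.__dict__.update(json.loads(updated))
      return chapter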

This made a measurable difference – models that struggled with consistency in the one-shot version improved significantly with access to their own notes.

Granular filtering instead of a single score:

We classify stories upfront by language, genre, tags, and content rating. Instead of one "creative writing" leaderboard, we can drill into specifics: which model writes the best Spanish Comedy? Which handles LitRPG stories with Male Leads the best? Which does well with romance versus horror?

The answers aren't always what you'd expect from general benchmarks. Some models that rank mid-tier overall dominate specific niches.

A few features I'm proud of:

Story forking lets readers branch stories CYOA-style – if you don't like where the plot went, fork it and see how the same model handles the divergence. Creates natural A/B comparisons.

Visual LitRPG was a personal itch to scratch. Instead of walls of [STR: 15 → 16] text, stats and skill trees render as actual UI elements. Example: https://narrator.sh/novel/beware-the-starter-pet/chapter/1

What I'm looking for:

More readers to build out the engagement data. Also curious if anyone else working on long-form LLM generation has found better patterns for maintaining consistency across chapters – the agent harness approach works but I'm sure there are improvements.

Show HN: PII-Shield – Log Sanitization Sidecar with JSON Integrity (Go, Entropy)

2026-02-03 @ 16:40:12 · Points: 12 · Comments: 7

PII-Shield is a log sanitization sidecar, written in Go, that deterministically masks secrets and PII in your logs. Why deterministic? So that "pass123" always hashes to the same "[HIDDEN:a1b2c]", allowing QA/devs to correlate errors without seeing the raw data.
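
For illustration, deterministic masking can be as small as a truncated hash. A minimal Python sketch (PII-Shield itself is written in Go and may use a different scheme, e.g. a keyed hash):

  import hashlib

  def mask(value: str, prefix_len: int = 5) -> str:
      # Same input -> same token, so errors correlate across log lines.
      # A production tool would prefer a keyed hash (HMAC) so that short
      # secrets can't be brute-forced back out of the token.
      digest = hashlib.sha256(value.encode()).hexdigest()[:prefix_len]
      return f"[HIDDEN:{digest}]"

  assert mask("pass123") == mask("pass123")  # deterministic by construction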

Key features:

1. JSON Integrity: it parses JSON, sanitizes values, and rebuilds it, guaranteeing valid JSON output for your SIEM (ELK/Datadog).

2. Entropy Detection: uses context-aware entropy analysis to catch high-randomness strings (see the sketch below).

3. Fail-Open: designed as a transparent pipe wrapper to preserve app uptime.
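
At its core, the entropy signal is Shannon entropy over a candidate string's characters; a hedged Python sketch follows (PII-Shield's context-aware thresholds are more involved than this):

  import math
  from collections import Counter

  def shannon_entropy(s: str) -> float:
      # Bits per character; random keys and tokens score far above prose.
      counts = Counter(s)
      return -sum((c / len(s)) * math.log2(c / len(s)) for c in counts.values())

  shannon_entropy("hello")                 # ~1.9 bits/char
  shannon_entropy("AKIAIOSFODNN7EXAMPLE")  # ~3.7 bits/char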

The project is open-source (Apache 2.0).

Repo: https://github.com/aragossa/pii-shield

Docs: https://pii-shield.gitbook.io/docs/

I'd love your feedback on the entropy/threshold logic!

France dumps Zoom and Teams as Europe seeks digital autonomy from the US

2026-02-03 @ 16:39:18 · Points: 483 · Comments: 277

Prek: A better, faster, drop-in pre-commit replacement, engineered in Rust

2026-02-03 @ 16:29:34 · Points: 133 · Comments: 62

Tadpole – A modular and extensible DSL built for web scraping

2026-02-03 @ 16:29:13 · Points: 26 · Comments: 5

X offices raided in France

2026-02-03 @ 16:14:17 · Points: 164 · Comments: 127

Show HN: C discrete event SIM w stackful coroutines runs 45x faster than SimPy

2026-02-03 @ 16:09:07 · Points: 36 · Comments: 14

I have built Cimba, a multithreaded discrete event simulation library in C.

Cimba uses POSIX pthread multithreading for parallel execution of multiple simulation trials, while coroutines provide concurrency inside each simulated trial universe. The simulated processes are based on asymmetric stackful coroutines with the context switching hand-coded in assembly.

The stackful coroutines make it natural to express agentic behavior by conceptually placing oneself "inside" that process and describing what it does. A process can run in an infinite loop or just act as a one-shot customer passing through the system, yielding and resuming execution from any level of its call stack, acting both as an active agent and a passive object as needed. This is inspired by my own experience programming in Simula67, many moons ago, where I found the coroutines more important than the deservedly famous object-orientation.
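
To make the process-as-coroutine idea concrete, here is a toy M/M/1 model written with Python generators rather than Cimba's C API (a conceptual sketch, not the library's interface): each customer is a coroutine that yields delays back to an event loop.

  import heapq, random

  def customer(env, server_free_at, stats):
      # One-shot process: arrive, wait for the server, get served, leave.
      arrival = env["now"]
      start = max(arrival, server_free_at[0])
      service = random.expovariate(1.0)     # exponential service times (mu = 1)
      server_free_at[0] = start + service   # reserve the server on arrival
      yield start - arrival                 # time spent waiting in queue
      yield service                         # time spent being served
      stats.append(env["now"] - arrival)    # total time in system

  def run(horizon=10_000.0, arrival_rate=0.9):
      env, events, stats, server_free_at = {"now": 0.0}, [], [], [0.0]
      t, seq = 0.0, 0
      while t < horizon:                    # schedule Poisson arrivals
          t += random.expovariate(arrival_rate)
          heapq.heappush(events, (t, seq, customer(env, server_free_at, stats)))
          seq += 1
      while events:                         # the event loop
          env["now"], _, proc = heapq.heappop(events)
          try:
              delay = next(proc)            # resume the coroutine
              heapq.heappush(events, (env["now"] + delay, seq, proc))
              seq += 1
          except StopIteration:
              pass
      return sum(stats) / len(stats)        # mean time in system, ~10 at these rates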

Cimba turned out to run really fast. In a simple benchmark, 100 trials of an M/M/1 queue run for one million time units each, it ran 45 times faster than an equivalent model built in SimPy + Python multiprocessing. The running time was reduced by 97.8 % vs the SimPy model. Cimba even processed more simulated events per second on a single CPU core than SimPy could do on all 64 cores.

The speed is not only due to the efficient coroutines. Other parts are also designed for speed, such as a hash-heap event queue (binary heap plus Fibonacci hash map), fast random number generators and distributions, memory pools for frequently used object types, and so on.
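
For reference, the "Fibonacci hash" half of such a hash-heap is typically multiply-shift hashing with the constant 2^64/phi; a generic sketch (Cimba's exact constants and layout may differ):

  GOLDEN = 0x9E3779B97F4A7C15  # floor(2^64 / golden ratio)

  def fib_hash(key: int, bits: int) -> int:
      # Multiply-shift: spreads even sequential keys evenly across 2**bits
      # buckets, giving a cheap index into the heap-position lookup table.
      return ((key * GOLDEN) & 0xFFFFFFFFFFFFFFFF) >> (64 - bits)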

The initial implementation supports the AMD64/x86-64 architecture for Linux and Windows. I plan to target Apple Silicon next, then probably ARM.

I believe this may interest the HN community. I would appreciate your views on both the API and the code. Any thoughts on future target architectures to consider?

Docs: https://cimba.readthedocs.io/en/latest/

Repo: https://github.com/ambonvik/cimba

Launch HN: Modelence (YC S25) – App Builder with TypeScript / MongoDB Framework

2026-02-03 @ 16:03:21 · Points: 45 · Comments: 23

We’re building Modelence (https://modelence.com). After spending years scaling our previous startup’s platform, we built an open-source full-stack TypeScript + MongoDB framework to stop re-implementing the same auth / database / API / cron-job plumbing every time we created an app, and we didn’t like the idea of using a separate managed platform for each of these to run our apps either.

(Here’s our prior Show HN post for reference: https://news.ycombinator.com/item?id=44902227)

At the same time, we were excited by the whole AI app builder boom and realized that the real challenge there is the platform rather than the tool itself. Now we’re making Modelence the first full-stack framework that’s built for coding agents and humans alike:

- TypeScript is already great for AI coding because it provides guardrails and catches many errors at build time, so agents can auto-correct

- MongoDB eliminates the schema management problem for agents, which is otherwise where they fail most often (+ it works great with TS/Node.js)

- Built-in auth, database, cron jobs, and more that just work together out of the box means agents focus only on your product logic and don’t fail trying to set these things up (+ fewer tokens spent on boilerplate).

You can now try the Modelence app builder (based on Claude Agent SDK) by just typing a prompt on our landing page ( https://modelence.com ) - watch a demo video here: https://youtu.be/BPsYvj_nGuE

Then you can check the app out locally and continue working in your own IDE, while still using Modelence Cloud as your backend with a dev cloud environment. Later you can deploy and run on Modelence Cloud, with built-in observability around every operation running in your app.

We’re also going to add a built-in DevOps agent that lives in the same cloud, knows the framework end-to-end, and will use all this observability data to act on errors, alerts, and incidents - closing the loop, because running in production is much harder than just building.

We launched the app builder as a quick start for developers, to demonstrate the framework and Modelence Cloud without having to manually read docs and follow the steps to set up a new app. Our main focus is still the platform itself, since we believe the real challenge in AI coding is the framework and the platform rather than the builder tool itself.

Qwen3-Coder-Next

2026-02-03 @ 16:01:50 · Points: 447 · Comments: 252

The next steps for Airbus' big bet on open rotor engines

2026-02-03 @ 15:31:40 · Points: 52 · Comments: 44

Show HN: Sandboxing untrusted code using WebAssembly

2026-02-03 @ 14:28:01 · Points: 53 · Comments: 18

I built a runtime to isolate untrusted code using wasm sandboxes.

Basically, it protects your host system from the problems that untrusted code can cause. We’ve had a great discussion here lately about sandboxing in Python that elaborates a bit more on the problem [1]. In TypeScript, Wasm integration is even more natural thanks to the close proximity of the two ecosystems.

The core is built in Rust. On top of that, I use WASI 0.2 via wasmtime and the component model, along with custom SDKs that keep things as idiomatic as possible.

For example, in Python we have a simple decorator:

  from capsule import task

  @task(
      name="analyze_data", 
      compute="MEDIUM",
      ram="512mb",
      allowed_files=["./authorized-folder/"],
      timeout="30s", 
      max_retries=1
  )
  def analyze_data(dataset: list) -> dict:
      """Process data in an isolated, resource-controlled environment."""
      # Your code runs safely in a Wasm sandbox
      return {"processed": len(dataset), "status": "complete"}
And in TypeScript we have a wrapper:

  import { task } from "@capsule-run/sdk"

  export const analyze = task({
      name: "analyzeData", 
      compute: "MEDIUM", 
      ram: "512mb",
      allowedFiles: ["./authorized-folder/"],
      timeout: 30000, 
      maxRetries: 1
  }, (dataset: number[]) => {
      return {processed: dataset.length, status: "complete"}
  });

You can set CPU (with compute), memory, filesystem access, and retries to keep precise control over your tasks.

It's still quite early, but I'd love feedback. I’ll be around to answer questions.

GitHub: https://github.com/mavdol/capsule

[1] https://news.ycombinator.com/item?id=46500510

Agent Skills

2026-02-03 @ 14:09:54 · Points: 295 · Comments: 181

Bunny Database

2026-02-03 @ 12:13:44 · Points: 200 · Comments: 90

Emerge Career (YC S22) is hiring a product designer

2026-02-03 @ 12:00:23 · Points: 1

What's up with all those equals signs anyway?

2026-02-03 @ 09:37:40 · Points: 533 · Comments: 163

Show HN: Safe-now.live – Ultra-light emergency info site (<10KB)

2026-02-03 @ 09:06:04 · Points: 156 · Comments: 68

Following a recent HN thread (https://news.ycombinator.com/item?id=46494734), I built safe-now.live – a text-first emergency info site for the US and Canada. No JavaScript, no images, under 10KB. It pulls live FEMA disasters, NWS alerts, weather, and local resources. This is my first live website ever, so I’m looking for critical feedback. Please feel free to look around.

https://safe-now.live

Floppinux – An Embedded Linux on a Single Floppy, 2025 Edition

2026-02-03 @ 04:33:25 · Points: 226 · Comments: 155

Another London: Excavating the disenchanted city

2026-02-01 @ 19:32:41 · Points: 15 · Comments: 0

Heritability of intrinsic human life span is about 50%

2026-02-01 @ 12:13:22 · Points: 113 · Comments: 71

Puget Systems Most Reliable Hardware of 2025

2026-01-31 @ 04:34:25 · Points: 35 · Comments: 8

The Everdeck: A Universal Card System (2019)

2026-01-28 @ 14:38:21 · Points: 80 · Comments: 19
