Hacker News

Latest

Setting up a free *.city.state.us locality domain

2026-05-13 @ 14:45:18 Points: 239 Comments: 61

The AI Backlash Could Get Ugly

2026-05-13 @ 14:44:41 Points: 50 Comments: 109

Dutch suicide prevention website shares data with tech companies without consent

2026-05-13 @ 12:57:42 Points: 222 Comments: 162

Why I'm leaving GitHub for Forgejo

2026-05-13 @ 12:54:00 Points: 381 Comments: 205

Substrate (YC S24) Is Hiring a Technical Success Manager

2026-05-13 @ 12:00:30 Points: 1

I Moved My Digital Stack to Europe

2026-05-13 @ 11:42:20 Points: 663 Comments: 448

Using OR-Tools CP-SAT for Scheduling Problems

2026-05-13 @ 11:02:57 Points: 55 Comments: 22

Deterministic Fully-Static Whole-Binary Translation Without Heuristics

2026-05-13 @ 04:25:03 Points: 270 Comments: 64

My graduation cap runs Rust

2026-05-13 @ 00:04:21 Points: 201 Comments: 79

When “idle” isn't idle: how a Linux kernel optimization became a QUIC bug

2026-05-12 @ 23:46:28 Points: 147 Comments: 27

Kraftwerk's radical 1976 track

2026-05-12 @ 23:13:01 Points: 216 Comments: 187

Restore full BambuNetwork support for Bambu Lab printers

2026-05-12 @ 21:55:21 Points: 611 Comments: 271

Scrcpy v4.0

2026-05-12 @ 20:50:02 Points: 333 Comments: 50

How to make your text look futuristic (2016)

2026-05-12 @ 20:16:26 Points: 448 Comments: 56

CERT is releasing six CVEs for serious security vulnerabilities in dnsmasq

2026-05-12 @ 18:12:28 Points: 357 Comments: 197

Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model

2026-05-12 @ 18:03:11 Points: 580 Comments: 168

We have long been frustrated by how little effort goes into building agentic models that run on budget phones. Investigating led us to one observation: agentic experiences are built on tool calling, and massive models are overkill for it. Tool calling is fundamentally retrieval and assembly (match the query to a tool name, extract argument values, emit JSON), not reasoning. Cross-attention is the right primitive for this, and FFN parameters are wasted at this scale.
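To make the three steps concrete (match the query to a tool name, extract argument values, emit JSON), here is a minimal sketch of the input/output contract of a single-shot tool call. It is not how Needle works internally: keyword overlap and regular expressions stand in for what the model learns with attention, and the tool names, keywords, and patterns are hypothetical, invented only for this illustration.

```python
import json
import re

# Hypothetical tool registry: names, keywords and argument patterns are
# made up for illustration and are not taken from the Needle dataset.
TOOLS = {
    "set_timer": {
        "keywords": {"timer", "countdown"},
        "args": {"minutes": r"(\d+)\s*(?:min|minute|minutes)\b"},
    },
    "send_message": {
        "keywords": {"message", "text", "tell"},
        "args": {"recipient": r"\b(?:to|tell)\s+([A-Z][a-z]+)"},
    },
}


def call_tool(query: str) -> str:
    """Turn a natural-language query into a JSON tool call."""
    words = set(re.findall(r"[a-z]+", query.lower()))

    # Step 1: match the query to a tool name.
    # Keyword overlap is a crude stand-in for learned retrieval;
    # ties (including zero overlap) fall back to the first tool.
    name = max(TOOLS, key=lambda t: len(TOOLS[t]["keywords"] & words))

    # Step 2: extract argument values directly from the query text.
    arguments = {}
    for arg, pattern in TOOLS[name]["args"].items():
        match = re.search(pattern, query)
        if match:
            arguments[arg] = match.group(1)

    # Step 3: emit JSON.
    return json.dumps({"name": name, "arguments": arguments})


if __name__ == "__main__":
    print(call_tool("Set a timer for 10 minutes"))
    print(call_tool("Send a message to Alice"))
```

Every value in the output is either a tool name from the registry or a span copied from the query, which is the sense in which the task is assembly rather than reasoning.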

Simple Attention Networks: the entire model is just attention and gating, with no MLPs anywhere. Needle is an experimental run targeting single-shot function calling on consumer devices (phones, watches, glasses...).
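As a rough picture of what "attention and gating, no MLPs" could look like, here is a sketch of one gated cross-attention block in plain Python. The post does not specify the exact layer, so everything below is an assumption: a single head, a sigmoid gate computed from the query tokens, and a residual connection. The function and parameter names are hypothetical; the linked writeup is the authority on the real architecture.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gated_cross_attention(x, ctx, wq, wk, wv, wg):
    """x: query tokens (n x d); ctx: tool-description tokens (m x d).

    Assumed form: out = x + sigmoid(x @ wg) * attention(x, ctx).
    No feed-forward (MLP) sublayer follows the attention.
    """
    q = matmul(x, wq)
    k = matmul(ctx, wk)
    v = matmul(ctx, wv)

    # Scaled dot-product attention from query tokens to context tokens.
    scale = 1.0 / math.sqrt(len(q[0]))
    scores = matmul(q, transpose(k))
    weights = [softmax([s * scale for s in row]) for row in scores]
    attended = matmul(weights, v)

    # Element-wise gate decides how much retrieved content to mix in.
    gate = [[sigmoid(g) for g in row] for row in matmul(x, wg)]
    return [
        [xi + gi * ai for xi, gi, ai in zip(x_row, g_row, a_row)]
        for x_row, g_row, a_row in zip(x, gate, attended)
    ]
```

In a conventional transformer block this function would be followed by a feed-forward sublayer; dropping that sublayer is the part the post describes.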

Training:
- Pretrained on 200B tokens across 16 TPU v6e (27 hours)
- Post-trained on 2B tokens of synthesized function-calling data (45 minutes)
- Dataset synthesized via Gemini with 15 tool categories (timers, messaging, navigation, smart home, etc.)
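For a sense of scale, the training figures above imply the following average throughput. This is back-of-the-envelope arithmetic only, assuming the quoted durations are wall-clock time and that all 16 chips were busy for the whole pretraining run; the post does not say which hardware the post-training used, so no per-chip figure is derived for it.

```python
# Figures quoted in the post.
PRETRAIN_TOKENS = 200e9
PRETRAIN_HOURS = 27
PRETRAIN_CHIPS = 16
POSTTRAIN_TOKENS = 2e9
POSTTRAIN_MINUTES = 45

# Average tokens per second over each run (assumes wall-clock durations).
pretrain_rate = PRETRAIN_TOKENS / (PRETRAIN_HOURS * 3600)
pretrain_rate_per_chip = pretrain_rate / PRETRAIN_CHIPS
posttrain_rate = POSTTRAIN_TOKENS / (POSTTRAIN_MINUTES * 60)

print(f"pretraining:   {pretrain_rate:,.0f} tokens/s total")
print(f"               {pretrain_rate_per_chip:,.0f} tokens/s per chip")
print(f"post-training: {posttrain_rate:,.0f} tokens/s total")
```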

You can test it right now and finetune on your Mac/PC: https://github.com/cactus-compute/needle

The full writeup on the architecture is here: https://github.com/cactus-compute/needle/blob/main/docs/simp...

We found that the "no FFN" result generalizes beyond function calling to any task where the model has access to external structured knowledge (RAG, tool use). The model doesn't need to memorize facts in FFN weights if the facts are provided in the input. Experimental results are to be published.

While it beats FunctionGemma-270M, Qwen-0.6B, Granite-350M, and LFM2.5-350M on single-shot function calling, those models have more scope and capacity and excel in conversational settings. We encourage you to test on your own tools via the playground and finetune accordingly.

This is part of our broader work on Cactus (https://github.com/cactus-compute/cactus), an inference engine built from scratch for mobile, wearables and custom hardware. We wrote about Cactus here previously: https://news.ycombinator.com/item?id=44524544

Everything is MIT licensed. Weights: https://huggingface.co/Cactus-Compute/needle GitHub: https://github.com/cactus-compute/needle

Quack: The DuckDB Client-Server Protocol

2026-05-12 @ 17:54:12 Points: 355 Comments: 75

Googlebook

2026-05-12 @ 17:37:36 Points: 887 Comments: 1470

As researchers age, they produce less disruptive work

2026-05-12 @ 17:16:59 Points: 131 Comments: 120

Why senior developers fail to communicate their expertise

2026-05-12 @ 15:08:40 Points: 738 Comments: 313

Rendering the Sky, Sunsets, and Planets

2026-05-12 @ 13:26:46 Points: 516 Comments: 40

An idiot's guide to lead optimisation for proteins

2026-05-11 @ 11:21:33 Points: 89 Comments: 5

Traceway: MIT-licensed observability stack you can self-host in ~90s

2026-05-11 @ 07:05:01 Points: 154 Comments: 38

Preserving Fisher-Price Pixter

2026-05-11 @ 06:52:48 Points: 161 Comments: 31

New stainless steel can survive conditions for hydrogen production in seawater

2026-05-11 @ 01:05:59 Points: 229 Comments: 103

Web Server on a Nintendo Wii

2026-05-09 @ 21:32:16 Points: 77 Comments: 24

Reverting the incremental GC in Python 3.14 and 3.15

2026-05-09 @ 20:25:55 Points: 123 Comments: 36

Nailing jelly to a wall: is it possible? (2005)

2026-05-09 @ 16:50:51 Points: 39 Comments: 14

The Boring Part of Bell Labs (2025)

2026-05-08 @ 13:00:19 Points: 105 Comments: 16

The vi family

2026-05-06 @ 07:51:16 Points: 252 Comments: 164
