Hacker News

Latest

One Startup Is Gambling. Ten Is Mathematics

2026-05-13 @ 06:42:54 · Points: 6

Deterministic Fully-Static Whole-Binary Translation Without Heuristics

2026-05-13 @ 04:25:03 · Points: 177 · Comments: 41

Starship V3

2026-05-13 @ 01:29:31 · Points: 227 · Comments: 344

My graduation cap runs Rust

2026-05-13 @ 00:04:21 · Points: 151 · Comments: 51

When "idle" isn't idle: how a Linux kernel optimization became a QUIC bug

2026-05-12 @ 23:46:28 · Points: 91 · Comments: 7

Kraftwerk's radical 1976 track

2026-05-12 @ 23:13:01 · Points: 159 · Comments: 110

Tell NYT, Atlantic, USA Today to keep Wayback Machine

2026-05-12 @ 23:11:40 · Points: 334 · Comments: 92

Restore full BambuNetwork support for Bambu Lab printers

2026-05-12 @ 21:55:21 · Points: 432 · Comments: 185

Scrcpy v4.0

2026-05-12 @ 20:50:02 · Points: 195 · Comments: 28

How to make your text look futuristic (2016)

2026-05-12 @ 20:16:26 · Points: 347 · Comments: 45

CERT is releasing six CVEs for serious security vulnerabilities in dnsmasq

2026-05-12 @ 18:12:28 · Points: 315 · Comments: 150

Show HN: Needle: We Distilled Gemini Tool Calling into a 26M Model

2026-05-12 @ 18:03:11 · Points: 464 · Comments: 154

We were always frustrated by how little effort goes into building agentic models that run on budget phones, so we investigated and arrived at an observation: agentic experiences are built on tool calling, and massive models are overkill for it. Tool calling is fundamentally retrieval-and-assembly (match the query to a tool name, extract argument values, emit JSON), not reasoning. Cross-attention is the right primitive for this, and FFN parameters are wasted at this scale.

Simple Attention Networks: the entire model is just attention and gating, no MLPs anywhere. Needle is an experimental run for single-shot function calling for consumer devices (phones, watches, glasses...).
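The retrieval-and-assembly framing can be sketched in plain Python. This is a toy illustration of the three steps (match, extract, assemble), not Needle's actual pipeline; the tool names, regex extraction, and JSON shape are assumptions:

```python
import json
import re

def call_tool(query: str) -> str:
    q = query.lower()
    # 1. Retrieval: match the query against known tool names
    #    (here just two illustrative tools: set_timer, send_message).
    if "timer" in q:
        name = "set_timer"
        # 2. Extraction: pull argument values out of the query text.
        minutes = int(re.search(r"(\d+)\s*minute", q).group(1))
        args = {"duration_minutes": minutes}
    else:
        name = "send_message"
        args = {"recipient": "", "body": query}
    # 3. Assembly: emit the structured JSON call.
    return json.dumps({"tool": name, "arguments": args})

print(call_tool("set a timer for 10 minutes"))
# → {"tool": "set_timer", "arguments": {"duration_minutes": 10}}
```

None of these steps requires open-ended reasoning, which is the intuition behind dropping the FFN blocks.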

Training:

- Pretrained on 200B tokens across 16 TPU v6e (27 hours)
- Post-trained on 2B tokens of synthesized function-calling data (45 minutes)
- Dataset synthesized via Gemini with 15 tool categories (timers, messaging, navigation, smart home, etc.)

You can test it right now and finetune on your Mac/PC: https://github.com/cactus-compute/needle

The full writeup on the architecture is here: https://github.com/cactus-compute/needle/blob/main/docs/simp...

We found that the "no FFN" finding generalizes beyond function calling to any task where the model has access to external structured knowledge (RAG, tool use). The model doesn't need to memorize facts in FFN weights if the facts are provided in the input. Experimental results to be published.

While it beats FunctionGemma-270M, Qwen-0.6B, Granite-350M, and LFM2.5-350M on single-shot function calling, those models have broader scope and capacity and excel in conversational settings. We encourage you to test on your own tools via the playground and finetune accordingly.

This is part of our broader work on Cactus (https://github.com/cactus-compute/cactus), an inference engine built from scratch for mobile, wearables and custom hardware. We wrote about Cactus here previously: https://news.ycombinator.com/item?id=44524544

Everything is MIT licensed. Weights: https://huggingface.co/Cactus-Compute/needle GitHub: https://github.com/cactus-compute/needle

Quack: The DuckDB Client-Server Protocol

2026-05-12 @ 17:54:12 · Points: 286 · Comments: 56

Reimagining the mouse pointer for the AI era

2026-05-12 @ 17:40:13 · Points: 204 · Comments: 172

Googlebook

2026-05-12 @ 17:37:36 · Points: 783 · Comments: 1300

As researchers age, they produce less disruptive work

2026-05-12 @ 17:16:59 · Points: 73 · Comments: 69

Show HN: Agentic interface for mainframes and COBOL

2026-05-12 @ 17:10:22 · Points: 76 · Comments: 41

We're the team behind Hypercubic (https://www.hypercubic.ai/), bringing AI tools to the mainframe and COBOL world. (We did a Launch HN last year: https://news.ycombinator.com/item?id=45877517.) Today we’re launching Hopper, an agentic development environment for mainframes.

You can download it here: https://www.hypercubic.ai/hopper, and you can also request access and immediately get a mainframe user account to play with.

There's also a video runthrough at https://www.youtube.com/watch?v=q81L5DcfBvE.

Mainframes still run a surprising amount of critical infrastructure: banking, payments, insurance, airlines, government programs, logistics, and core operations at large institutions. Many of these systems are decades old, but they continue to process enormous transaction volumes because they are reliable, secure, and deeply embedded into business operations.

A lot of that software is written in COBOL and runs on IBM z/OS. The development environment looks very different from modern cloud or Unix-style development. Instead of GitHub, shell commands, package managers, and CI pipelines, developers often work through TN3270 terminal sessions, ISPF panels, partitioned datasets, JCL, JES queues, spool output, return codes, VSAM files, CICS transactions, and shop-specific conventions.

TN3270 is the terminal interface used to interact with many IBM mainframe systems. ISPF is the menu and panel system developers use inside that terminal to browse datasets, edit source, submit jobs, and inspect output. It is powerful and reliable, but it was designed for expert humans navigating screens, function keys, and fixed-width workflows, not AI agents.

A simple COBOL change might require finding the right source member, checking copybooks, locating compile JCL, submitting a job, reading JES/SYSPRINT output, interpreting condition codes, patching fixed-width source, and resubmitting.

Much of this work is so well-defined and repetitive that it's a good fit for agentic AI. To get that working, however, a chatbot next to a terminal is not enough. The agent needs to operate inside the mainframe environment.

Hopper combines three things: (1) A real TN3270 terminal, (2) Mainframe-aware panels for datasets, members, jobs, and spool output, and (3) An AI agent that can operate across those z/OS surfaces.

For example, here is a tiny version of the kind of thing Hopper can help debug:

  COBOL:

   IDENTIFICATION DIVISION.
   PROGRAM-ID. PAYCALC.

   DATA DIVISION.
   WORKING-STORAGE SECTION.
   01  CUSTOMER-BALANCE     PIC 9(7)V99.

   PROCEDURE DIVISION.
       ADD 100.00 TO CUSTOMER-BALNCE
       DISPLAY "UPDATED BALANCE: " CUSTOMER-BALANCE
       STOP RUN.


  JCL:

    //PAYCOMP  JOB (ACCT),'COMPILE',CLASS=A,MSGCLASS=X
    //COBOL    EXEC IGYWCL
    //COBOL.SYSIN DD DSN=USER1.APP.COBOL(PAYCALC),DISP=SHR
    //LKED.SYSLMOD DD DSN=USER1.APP.LOAD(PAYCALC),DISP=SHR

A human would submit this job, inspect JES output, open `SYSPRINT`, find the undefined `CUSTOMER-BALNCE`, map it back to the source, patch the member, and resubmit. Hopper is designed to let an agent operate through that same loop autonomously.
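That submit-inspect-patch-resubmit cycle can be sketched as a small simulation. `compile_member` and the name lists below are stand-ins for real z/OS operations (job submission, SYSPRINT parsing), not Hopper's API:

```python
import difflib

def compile_member(defined, used):
    # Simulated compiler: report names used in the PROCEDURE DIVISION
    # that were never declared in WORKING-STORAGE, the way a compile
    # listing flags an undefined data-name.
    return [u for u in used if u not in defined]

def agent_fix_loop(defined, used, max_attempts=3):
    for _ in range(max_attempts):
        errors = compile_member(defined, used)
        if not errors:
            return used  # clean compile, nothing left to patch
        for bad in errors:
            # Map the undefined name back to the closest declared one,
            # as a human would on spotting CUSTOMER-BALNCE vs CUSTOMER-BALANCE.
            close = difflib.get_close_matches(bad, defined, n=1)
            if close:
                used = [close[0] if u == bad else u for u in used]
    return used

print(agent_fix_loop(["CUSTOMER-BALANCE"], ["CUSTOMER-BALNCE"]))
# → ['CUSTOMER-BALANCE']
```

The point of the loop shape is that each iteration is verifiable (the compile either goes clean or it doesn't), which is what makes it a good fit for an agent.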

Hopper is not trying to hide the mainframe behind a generic abstraction, and it's not a chatbot. The design principle is simple: preserve the fidelity of the mainframe environment, but make it accessible to AI agents.

Sensitive operations require approval, and the terminal remains visible at all times.

Once agents can operate inside the mainframe environment, new workflows become possible: faster job debugging, automated documentation, safer code changes, test generation, migration planning, traffic replay, and modernization verification.

We’re curious to hear your thoughts, especially from anyone who has worked with mainframes or COBOL or has done legacy enterprise modernization.

The Future of Obsidian Plugins

2026-05-12 @ 15:45:54 · Points: 379 · Comments: 142

Launch HN: Voker (YC S24) – Analytics for AI Agents

2026-05-12 @ 15:45:20 · Points: 53 · Comments: 20

We're building Voker (https://voker.ai/), an agent analytics platform for AI product teams. Voker gives full visibility into what users are asking of your agents, and whether your agents are delivering, without having to dig through logs. Our main product is a lightweight SDK that is LLM-stack agnostic and purpose-built for agent products. (https://app.voker.ai/docs)

Agent Engineers and AI product teams don’t have the right level of visibility into agent performance in production, which results in bad user experiences, churn, and hundreds of hours wasted with spot checks to find and debug issues with agent configurations.

Demo: https://www.tella.tv/video/vid_cmoukcsk1000i07jgb4j65u67/vie...

We recently surveyed YC founders, and over 90% of respondents said the only way they know their agents are failing users in production is by hearing complaints from customers. They push a prompt change hoping it fixes the problem without breaking something else, and the cycle repeats.

We saw tons of observability and evals products popping up to address these problems, but we still felt something was missing in the agent monitoring stack. Observability is good for individual trace debugging but is only accessible to engineers. Evals are good for testing known issues but don't surface trends that teams don't expect, so engineers are always playing catch-up. Traditional product analytics tools do a good job tracking clicks and pageviews across your product surface, but they weren't built from the ground up for agent products. Knowing what users want out of agents, and whether the agent delivered, requires specific conversational-intelligence and unstructured-data-processing techniques.

We came up with the agent analytics primitives of Intents, Corrections, and Resolutions to describe something pretty much all conversational agents have in common: a user always comes to an agent with an intent, the user might have to correct the agent on the way to getting that intent resolved, and hopefully every intent is eventually resolved by the agent. Voker processes LLM calls by automatically annotating individual conversations, picking out user intents and corrections. It then uses LLMs and hierarchical text classification to build dynamic categories that surface higher-level insights, so you don't have to read individual conversations to know the main usage patterns across your users.
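As a rough illustration, the three primitives could be modeled like this. The field names and the `resolution_rate` aggregate are our own illustrative assumptions, not Voker's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Intent:
    text: str                                        # what the user asked the agent for
    corrections: list = field(default_factory=list)  # user push-back along the way
    resolved: bool = False                           # did the agent eventually deliver?

def resolution_rate(intents):
    # An aggregate a dashboard could surface without reading transcripts.
    return sum(i.resolved for i in intents) / len(intents)

session = [
    Intent("book a table for two", resolved=True),
    Intent("change it to 8pm", corrections=["no, 8pm not 7pm"], resolved=True),
    Intent("add a birthday note", resolved=False),
]
print(f"{resolution_rate(session):.2f}")  # → 0.67
```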

The most common substitute solution we’ve seen is uploading obs logs to Claude or ChatGPT and asking for summary insights. There are a few problems with this, mainly that LLMs aren’t good at math or data science, so you don’t get accurate or consistent statistics. It’s highly likely that the LLM overfits to some insights and underfits to others, and it isn’t programmatically reading and classifying each individual session or interaction. This is why we don’t use LLMs for any of our core data engineering (processing events, calculating statistics): the analytics we produce are consistent, reproducible, and accurate.

We have a publicly available, lightweight SDK that wraps LLM calls to OpenAI, Anthropic, and Gemini in Python and TypeScript. Voker handles the data engineering to turn raw data into usable analytics primitives and higher-level insights. Free tier: 2,000 events/mo, requires email signup. Paid plans start at $80/mo with a 30-day free trial.
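The wrap-the-LLM-call pattern can be sketched as a decorator that records metadata without touching the call's inputs or outputs. This is our own minimal illustration of the shape, not Voker's real SDK surface; `tracked`, `EVENTS`, and `fake_llm` are all hypothetical names:

```python
import time

EVENTS = []  # in a real SDK these would batch-ship to an analytics backend

def tracked(call):
    # Decorator recording metadata around any LLM provider call,
    # leaving the provider's request and response unchanged.
    def wrapper(**kwargs):
        start = time.monotonic()
        response = call(**kwargs)
        EVENTS.append({
            "model": kwargs.get("model"),
            "latency_s": time.monotonic() - start,
        })
        return response
    return wrapper

@tracked
def fake_llm(model, prompt):
    # Stand-in for a real OpenAI/Anthropic/Gemini client call.
    return f"echo: {prompt}"

print(fake_llm(model="demo", prompt="hi"))  # → echo: hi
```

Keeping the wrapper a pass-through is what makes an SDK like this stack-agnostic: the provider client never knows it is being observed.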

We'd love to hear how you're currently detecting trends, and if you try Voker, tell us what part of our analysis is valuable, and what still feels missing. Thanks for reading, and we’re looking forward to your thoughts in the comments!

Why senior developers fail to communicate their expertise

2026-05-12 @ 15:08:40 · Points: 578 · Comments: 252

Bambu Lab is abusing the open source social contract

2026-05-12 @ 14:54:41 · Points: 1270 · Comments: 395

Rendering the Sky, Sunsets, and Planets

2026-05-12 @ 13:26:46 · Points: 475 · Comments: 39

Traceway: MIT-licensed observability stack you can self-host in ~90s

2026-05-11 @ 07:05:01 · Points: 112 · Comments: 9

Up in Smoke

2026-05-11 @ 02:13:02 · Points: 23 · Comments: 2

Referer Reality

2026-05-10 @ 20:31:10 · Points: 40 · Comments: 13

I made a copy of Rust's cargo, but for C++

2026-05-10 @ 14:44:17 · Points: 13 · Comments: 4

Fc, a lossless compressor for floating-point streams

2026-05-10 @ 10:14:20 · Points: 70 · Comments: 13

Lanzaboote – NixOS Secure Boot

2026-05-09 @ 18:55:59 · Points: 88 · Comments: 8

When life gives you lemons, write better error messages

2026-05-08 @ 21:31:44 · Points: 151 · Comments: 54

The vi family

2026-05-06 @ 07:51:16 · Points: 170 · Comments: 95

Archives

2026

2025

2024

2023

2022