Hacker News
Latest
I don't know how you get here from "predict the next word."
2026-02-26 @ 04:59:01 Points: 124 Comments: 152
Self-improving software won't produce Skynet
2026-02-26 @ 03:36:57 Points: 26 Comments: 20
RAM now represents 35 percent of bill of materials for HP PCs
2026-02-26 @ 02:43:26 Points: 222 Comments: 139
Show HN: OpenSwarm – Multi‑Agent Claude CLI Orchestrator for Linear/GitHub
2026-02-26 @ 02:19:02 Points: 24 Comments: 13
Show HN: ZSE – Open-source LLM inference engine with 3.9s cold starts
2026-02-26 @ 01:15:25 Points: 49 Comments: 4
The problem I was trying to solve: Running a 32B model normally requires ~64 GB VRAM. Most developers don't have that. And even when quantization helps with memory, cold starts with bitsandbytes NF4 take 2+ minutes on first load and 45–120 seconds on warm restarts — which kills serverless and autoscaling use cases.
What ZSE does differently:
Fits 32B in 19.3 GB VRAM (70% reduction vs FP16) — runs on a single A100-40GB
Fits 7B in 5.2 GB VRAM (63% reduction) — runs on consumer GPUs
Native .zse pre-quantized format with memory-mapped weights: 3.9s cold start for 7B, 21.4s for 32B — vs 45s and 120s with bitsandbytes, ~30s for vLLM
All benchmarks verified on Modal A100-80GB (Feb 2026)
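As a rough check on the figures above, the sketch below recomputes the FP16 baseline and the percentage reductions. It assumes 2 bytes per parameter for FP16 and counts weights only (no KV cache or activations); the function names are illustrative and not part of ZSE.

```python
def fp16_weight_gb(params_billion: float) -> float:
    """Approximate FP16 weight footprint in GB, assuming 2 bytes per parameter."""
    return params_billion * 2.0


def reduction_pct(baseline_gb: float, actual_gb: float) -> float:
    """Percentage of memory saved relative to the baseline."""
    return (1.0 - actual_gb / baseline_gb) * 100.0


# VRAM figures quoted in the post: 32B in 19.3 GB, 7B in 5.2 GB.
for name, params_b, quoted_gb in [("32B", 32, 19.3), ("7B", 7, 5.2)]:
    baseline = fp16_weight_gb(params_b)
    print(f"{name}: FP16 ~{baseline:.0f} GB -> {quoted_gb} GB "
          f"({reduction_pct(baseline, quoted_gb):.0f}% reduction)")
```

Under these assumptions the 32B baseline comes out at about 64 GB and the 7B baseline at about 14 GB, which matches the quoted 70% and 63% reductions after rounding.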
It ships with:
OpenAI-compatible API server (drop-in replacement)
Interactive CLI (zse serve, zse chat, zse convert, zse hardware)
Web dashboard with real-time GPU monitoring
Continuous batching (3.45× throughput)
GGUF support via llama.cpp
CPU fallback — works without a GPU
Rate limiting, audit logging, API key auth
Install:
    pip install zllm-zse
    zse serve Qwen/Qwen2.5-7B-Instruct
For fast cold starts (one-time conversion):
    zse convert Qwen/Qwen2.5-Coder-7B-Instruct -o qwen-7b.zse
    zse serve qwen-7b.zse  # 3.9s every time
The cold start improvement comes from the .zse format storing pre-quantized weights as memory-mapped safetensors — no quantization step at load time, no weight conversion, just mmap + GPU transfer. On NVMe SSDs this gets under 4 seconds for 7B. On spinning HDDs it'll be slower.
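To illustrate the general idea of loading pre-converted weights through a memory map, here is a minimal sketch using only the Python standard library. The flat float32 file layout and the function names are assumptions made for the example; this is not the real .zse or safetensors layout, and the GPU transfer step is left out.

```python
import mmap
import os
import tempfile
from array import array


def write_weights(path: str, values: list[float]) -> None:
    """One-time 'conversion': store weights in their final on-disk layout."""
    with open(path, "wb") as f:
        array("f", values).tofile(f)


def first_weights(path: str, n: int) -> list[float]:
    """Map the file and read the first n float32 values with no parsing step."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Zero-copy float32 view over the mapped pages; both views are
            # released before the mapping itself is closed.
            with memoryview(mm) as raw, raw.cast("f") as view:
                return view[:n].tolist()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "weights.bin")
    write_weights(path, [0.5, -1.25, 2.0, 3.75])
    print(first_weights(path, 2))
```

Because the file already holds the weights in their final representation, loading reduces to mapping pages and reading them, which is why storage speed dominates the load time.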
All code is real — no mock implementations. Built at Zyora Labs. Apache 2.0.
Happy to answer questions about the quantization approach, the .zse format design, or the memory efficiency techniques.
Tech companies shouldn't be bullied into doing surveillance
2026-02-26 @ 00:37:32 Points: 228 Comments: 70
First Website (1992)
2026-02-25 @ 23:02:58 Points: 198 Comments: 41
How will OpenAI compete?
2026-02-25 @ 22:29:25 Points: 200 Comments: 234
An autopsy of AI-generated 3D slop
2026-02-25 @ 21:05:15 Points: 66 Comments: 41
Making MCP cheaper via CLI
2026-02-25 @ 20:29:37 Points: 203 Comments: 84
Jimi Hendrix was a systems engineer
2026-02-25 @ 20:16:47 Points: 450 Comments: 146
PA bench: Evaluating web agents on real world personal assistant workflows
2026-02-25 @ 20:11:37 Points: 33 Comments: 3
We built PA Bench (Personal Assistant Benchmark) to evaluate frontier computer/web use models on their ability to handle multi-step workflows across simulated clones of Gmail and Calendar.
*What’s next:*
We’re currently scaling the dataset to 3+ tabs and are building more high-fidelity simulations for common enterprise workflows. We’d love to hear feedback on the benchmark and notes about what was/wasn’t surprising about the results.
Blog post: https://vibrantlabs.com/blog/pa-bench
Google API keys weren't secrets, but then Gemini changed the rules
2026-02-25 @ 19:54:14 Points: 473 Comments: 99
The Om Programming Language
2026-02-25 @ 17:48:21 Points: 258 Comments: 65
Windows 11 Notepad to support Markdown
2026-02-25 @ 17:14:19 Points: 264 Comments: 406
Bus stop balancing is fast, cheap, and effective
2026-02-25 @ 16:31:26 Points: 351 Comments: 502
GNU Texmacs
2026-02-25 @ 15:37:29 Points: 158 Comments: 44
Show HN: Respectify – A comment moderator that teaches people to argue better
2026-02-25 @ 14:21:19 Points: 151 Comments: 142
Current moderation tools just seem to focus on deletion and banning. Wouldn’t it be helpful to encourage productive discussion and teach people how to discuss and argue (in the debate sense) better?
A year ago we started building Respectify to help foster healthy communication. Instead of just deleting bad-faith comments, we suggest better, good-faith ways to say what folks are trying to say. We help people avoid:
* Logical fallacies (false dichotomy, strawmen, etc.)
* Tone issues (how others will read the comment)
* Drifting off-topic (relevance to the actual page/post topic)
* Low-effort posts
* Dog whistles and coded language
The commenter gets an explanation of what's wrong and a chance to edit and resubmit. It's moderation + education in one step. We also want to automate the entire process so the site owner can focus on content and not worry about moderation at all, while quietly coaching better thinking over time, comment by comment.
Our main website has an interactive demo: https://respectify.ai. As the demo shows, the system is completely tunable and adjustable, from "most anything goes" to "You need to be college debate level to get by me".
We hope the result is better discussions and a better Internet. Not too much to ask, eh?
We love the kind of feedback this group is famous for and hope you will supply some!
Launch HN: TeamOut (YC W22) – AI agent for planning company retreats
2026-02-25 @ 14:02:02 Points: 50 Comments: 57
Here’s a demo: https://www.youtube.com/watch?v=QVyc-x-isjI. The product is live at https://app.teamout.com/ai and does not require signup.
We went through YC in 2022 but did not launch on HN at the time. Back then, the product was more traditional, closer to an Airbnb-style search marketplace. Over the past two years, after helping organize more than 1,200 events, we rebuilt the core system around an agent architecture that directly manages the planning process. With this new version live, it felt like the right moment to share it here since it represents a fundamentally different approach to planning events.
The problem: Planning a company retreat usually means choosing between three imperfect options: (1) Hire an event planner and pay significant fees and venue markups; (2) Do it yourself and spend dozens of hours on research, emails, and negotiation; or (3) Use tools like Airbnb that are not designed for group logistics or meeting space.
The difficulty is not just finding a venue. Even for 30 to 50 people, planning turns into weeks of back-and-forth emails for quotes, comparing inconsistent pricing across PDFs, and tracking budgets in spreadsheets. It becomes an ongoing coordination problem with evolving constraints and slow, asynchronous vendor responses. Most existing software is form-driven, but the real workflow is conversational and stateful.
Offsites are expensive and high stakes. A single event can represent a significant chunk of a team’s annual budget, and mistakes show up directly as cost overruns or poor experiences. Founders and operators often end up spending time on event logistics instead of their actual work.
I ran into this while organizing retreats at a previous company. Before TeamOut, I worked as an AI researcher at IBM on NLP and machine learning systems. From inside long email threads and cost spreadsheets, it did not look like a marketplace gap to me. It looked like a reasoning and state management problem. As large language models improved at multi-step reasoning and tool use, it became realistic to automate the coordination layer itself.
Our Solution: The core agent relies on a combination of models such as Gemini, Claude, and GPT. A central LLM-based agent maintains planning context across turns and decides which specialized tool to call next. Each tool has a specific responsibility:
- Venue search and filtering
- Cost estimations (accommodation + flights)
- Budget comparisons
- Quote and outreach flows
- Communication tool with our team
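The pattern described here (a central agent that keeps state across turns and picks one single-purpose tool at a time) can be sketched roughly as below. This is not TeamOut's code: the state fields, tool names, and the rule-based `choose_tool` function are all hypothetical, and `choose_tool` stands in for the decision an LLM would make.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class PlanState:
    """Planning context carried across conversation turns."""
    headcount: Optional[int] = None
    budget_usd: Optional[float] = None
    log: list[str] = field(default_factory=list)  # tools already called


def venue_search(state: PlanState) -> str:
    return f"searching venues for {state.headcount} people"


def cost_estimate(state: PlanState) -> str:
    return f"estimating costs against a ${state.budget_usd:,.0f} budget"


# Each tool has one responsibility; the registry maps names to callables.
TOOLS: dict[str, Callable[[PlanState], str]] = {
    "venue_search": venue_search,
    "cost_estimate": cost_estimate,
}


def choose_tool(state: PlanState) -> Optional[str]:
    """Stand-in for the LLM's decision: pick the next tool from the state."""
    if state.headcount is None:
        return None  # not enough information yet; ask the user instead
    if "venue_search" not in state.log:
        return "venue_search"
    if state.budget_usd is not None and "cost_estimate" not in state.log:
        return "cost_estimate"
    return None


def step(state: PlanState) -> Optional[str]:
    """Run one turn: choose a tool, call it, and record the call in the state."""
    name = choose_tool(state)
    if name is None:
        return None
    result = TOOLS[name](state)
    state.log.append(name)
    return result


state = PlanState(headcount=40, budget_usd=60000)
print(step(state))
print(step(state))
```

Keeping the call history inside the state object is what lets the next decision depend on earlier turns rather than on the latest message alone.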
For venue recommendations across more than 10,000 venues, we do not rely purely on the language model. We embed both user requirements and venues into vector representations and retrieve candidates using similarity search. Hard constraints such as capacity and dates are applied first, and results are ranked before being presented.
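A minimal sketch of that retrieval order, filtering on hard constraints first and then ranking the survivors by embedding similarity, might look like this. The `Venue` fields, the toy three-dimensional embeddings, and the use of cosine similarity are assumptions for illustration; the post does not say which similarity measure or index is used.

```python
import math
from dataclasses import dataclass


@dataclass
class Venue:
    name: str
    capacity: int
    available_dates: set[str]
    embedding: list[float]  # toy vector; a real one would come from a model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recommend(venues: list[Venue], query: list[float],
              headcount: int, date: str, top_k: int = 3) -> list[Venue]:
    # 1. Hard constraints: drop venues that cannot host the event at all.
    eligible = [v for v in venues
                if v.capacity >= headcount and date in v.available_dates]
    # 2. Rank the remaining venues by similarity to the requirements vector.
    ranked = sorted(eligible, key=lambda v: cosine(v.embedding, query),
                    reverse=True)
    return ranked[:top_k]


venues = [
    Venue("Lakeside Lodge", 60, {"2026-06-10"}, [0.9, 0.1, 0.0]),
    Venue("City Loft", 45, {"2026-06-10"}, [0.1, 0.9, 0.0]),
    Venue("Mountain Cabin", 20, {"2026-06-10"}, [1.0, 0.0, 0.0]),  # too small
    Venue("Beach Resort", 80, {"2026-07-01"}, [0.8, 0.2, 0.0]),    # wrong date
]
picks = recommend(venues, [1.0, 0.0, 0.0], headcount=40, date="2026-06-10")
print([v.name for v in picks])
```

Applying the filter before ranking means a venue that is a close semantic match but too small or unavailable never reaches the results, whatever its similarity score.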
On the interface side, we use a split layout: conversation on the left and structured results on the right. As you refine the plan in chat, the event updates in real time, allowing an iterative workflow rather than a static search experience.
What is different is that we treat event planning as a stateful coordination problem rather than a one-shot search query. The agent orchestrates tools, manages evolving constraints, and surfaces trade-offs explicitly. It does not invent venues or fabricate pricing, and it is not designed to replace human planners for very large or highly customized events.
We make money from commissions on venue bookings. It is free for teams to explore options and plan. If you’ve organized an offsite or large meetup before, I’d genuinely value your perspective. Where would you expect this to fail? What edge cases are we underestimating? Where wouldn’t you trust an agent to handle the details?
My engineering team and I will be here all day to answer questions, and we're happy to go deep on architecture, tradeoffs, and lessons learned. We’d really appreciate your candid feedback.