tweek.ninja

Full-stack developer

AI product backends, data pipelines, and web interfaces built for production at scale.

I design and build AI‑driven platforms—from backend services and data pipelines to the interfaces that deliver them.

My work focuses on building and maintaining production systems that stay stable over time.


RAG & streaming chat built for scale

Talkie

End-to-end RAG & streaming chat platform engineered for reliable performance at scale.

1. GEN search
Pure LLM responses with no retrieval and no document context.
2. File upload & auto-indexing
Documents are uploaded, chunked, embedded, and stored in the vector index.
3. RAG search
The same query, now grounded with retrieved context and citations.
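
As an illustration of the chunking part of step 2, here is a minimal sketch of fixed-size chunking with overlap. It is not Talkie's actual code: the function name `chunk_text` and the `size` and `overlap` parameters are assumptions, and a real pipeline may well split on tokens or sentences instead of characters.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows of `size`, each sharing
    `overlap` characters with the previous window (hypothetical helper)."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each window advances
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reached the end of the text
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of embedding some text twice.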

Why I built this:
Many RAG examples look impressive but fall apart once real load, concurrency, or background indexing enter the picture.
Talkie was built to stay fast and reliable long after the demo ends.

Design focus:
Clean separation between Gateway, Workers, and Infra so chat, ingest, and retrieval can scale independently while staying observable.

What makes this hard:
Keeping ingest → index → retrieval → generation reliable under load, while streaming tokens in real time and staying vendor-agnostic at the LLM layer.

I made the LLM pipeline explicitly measurable: end-to-end p95/p99 latency and burst drain time can be observed consistently under bounded concurrency and fixed test conditions.
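
A rough sketch of what such a measurement can look like, assuming an asyncio-based harness. The names `measure` and `percentile` are hypothetical and not taken from Talkie; the percentile uses the nearest-rank definition, which is one of several common choices.

```python
import asyncio
import math
import time


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


async def measure(call, n_requests: int, max_concurrency: int):
    """Fire a burst of n_requests calls with at most max_concurrency in
    flight. Returns (per-request latencies, total drain time), in seconds."""
    sem = asyncio.Semaphore(max_concurrency)

    async def one(i: int) -> float:
        async with sem:
            # Timed after the slot is acquired, so queueing time is
            # excluded from latency and shows up in drain time instead.
            start = time.perf_counter()
            await call(i)
            return time.perf_counter() - start

    burst_start = time.perf_counter()
    latencies = await asyncio.gather(*(one(i) for i in range(n_requests)))
    drain = time.perf_counter() - burst_start
    return list(latencies), drain
```

With latencies collected this way, `percentile(latencies, 95)` and `percentile(latencies, 99)` give the tail figures, and `drain` shows how long the whole burst took to clear at the chosen concurrency limit.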

source

Real‑time agent observability & replay

Talkie-Lab

Agent-first LLM lab for debugging, replay, and real-time observability.

Real-time streaming agent chat
Session-aware agent state tracking
Prompt experiments with version history

Why I built this: Most LLM demos stop at “it works.” I wanted to understand why it fails.

Design focus: Agents must be observable, replayable, and debuggable in real time — not guessed from logs.

What makes this hard: Streaming tokens, agent state, and event logs all have to stay perfectly in sync across async boundaries to make replay and debugging trustworthy.
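
One common way to keep these in sync is to treat a single ordered event log as the source of truth and derive everything else from it. The sketch below illustrates that idea only; it is not Talkie-Lab's implementation, and `Event`, `Session`, and `replay` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Event:
    seq: int      # position in the session's append-only log
    kind: str     # "token" or "state" in this sketch
    payload: str


@dataclass
class Session:
    events: list[Event] = field(default_factory=list)

    def record(self, kind: str, payload: str) -> Event:
        """Append an event; seq is assigned from the log length."""
        ev = Event(len(self.events), kind, payload)
        self.events.append(ev)
        return ev


def replay(events: list[Event], upto: Optional[int] = None) -> dict:
    """Rebuild streamed text and agent state by folding over the log,
    optionally stopping after sequence number `upto`."""
    text: list[str] = []
    state: Optional[str] = None
    for ev in sorted(events, key=lambda e: e.seq):
        if upto is not None and ev.seq > upto:
            break
        if ev.kind == "token":
            text.append(ev.payload)
        elif ev.kind == "state":
            state = ev.payload
    return {"text": "".join(text), "state": state}
```

Because the streamed text and the agent state are both computed from the same sequence, replaying up to any `seq` gives a consistent view of what the session looked like at that point.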

source

About

I'm a full‑stack developer who builds and operates end‑to‑end systems, covering backend architecture, frontend interfaces, and deployment workflows.

I focus on systems that can be operated under real‑world conditions, not just demonstrated as prototypes, with an emphasis on reliability, clarity, and long‑term maintainability.

My current focus is on systems that behave reliably under production‑like traffic and constraints, refined iteratively through real usage.

How I Work

Start with a working prototype to validate assumptions quickly
Stabilize systems through real usage and iterative refinement
Design for debuggability and operational visibility
Document decisions to reduce long-term cognitive load

What I Usually Do

System design and architecture
Full-stack development
Infrastructure and deployment
Tooling, debugging, and performance optimization

Focus & Interests

Event‑driven and streaming architectures (Kafka, Redis, background workers)
RAG and LLM‑powered systems that can be observed, debugged, and iterated safely
Operational tooling: dashboards, metrics, and internal utilities that make systems easier to run
Designing APIs and interfaces that stay maintainable as systems grow
