---
url: https://lettuceai.app/changelog
title: "Changelog — LettuceAI"
description: "Track updates, improvements, and fixes across LettuceAI releases."
---

Changelog

# What's new

Track updates, improvements, and fixes across LettuceAI releases.

Download Latest Version 

Jump to release21 versions

-   [
    
    2.0.0 / 2.0.0
    
    Jun 27, 2026
    
    ](#v-2.0.0-2.0.0)
-   [
    
    1.9.0 / 1.6.0
    
    May 31, 2026
    
    ](#v-1.9.0-1.6.0)
-   [
    
    1.8.2 / 1.5.2
    
    May 23, 2026
    
    ](#v-1.8.2-1.5.2)
-   [
    
    1.8.1 / 1.5.1
    
    May 20, 2026
    
    ](#v-1.8.1-1.5.1)
-   [
    
    1.8.0 / 1.5.0
    
    May 18, 2026
    
    ](#v-1.8.0-1.5.0)
-   [
    
    1.7.2 / 1.4.1
    
    May 11, 2026
    
    ](#v-1.7.2-1.4.1)
-   [
    
    1.7.0 / 1.4.0
    
    May 11, 2026
    
    ](#v-1.7.0-1.4.0)
-   [
    
    1.6.0 / 1.3.0
    
    May 4, 2026
    
    ](#v-1.6.0-1.3.0)
-   [
    
    1.5.1 / 1.2.1
    
    Apr 13, 2026
    
    ](#v-1.5.1-1.2.1)
-   [
    
    1.5.0 / 1.2.0
    
    Apr 13, 2026
    
    ](#v-1.5.0-1.2.0)
-   [
    
    1.4.1 / 1.1.1
    
    Apr 6, 2026
    
    ](#v-1.4.1-1.1.1)
-   [
    
    1.4.0 / 1.1.0
    
    Apr 6, 2026
    
    ](#v-1.4.0-1.1.0)
-   [
    
    1.3.3 / 1.0.3
    
    Mar 29, 2026
    
    ](#v-1.3.3-1.0.3)
-   [
    
    1.3.2 / 1.0.2
    
    Mar 27, 2026
    
    ](#v-1.3.2-1.0.2)
-   [
    
    1.3.1 / 1.0.1
    
    Mar 23, 2026
    
    ](#v-1.3.1-1.0.1)
-   [
    
    1.3.0 / release
    
    Mar 22, 2026
    
    ](#v-1.3.0-desktop)
-   [
    
    1.2.0 / Beta 4
    
    Feb 15, 2026
    
    ](#v-1.2.0-beta4)
-   [
    
    1.1.0 / Beta 3
    
    Jan 31, 2026
    
    ](#v-1.1.0-beta3)
-   [
    
    Release / Beta 2
    
    Jan 4, 2026
    
    ](#v-android-release-beta2)
-   [
    
    v1.0-beta.6.2
    
    Dec 24, 2025
    
    ](#v-beta-6.2)
-   [
    
    v1.0-beta.6
    
    Dec 21, 2025
    
    ](#v-beta-6)

Releases

2026

-   [
    
    2.0.0 / 2.0.0
    
    Jun 27
    
    ](#v-2.0.0-2.0.0)
-   [
    
    1.9.0 / 1.6.0
    
    May 31
    
    ](#v-1.9.0-1.6.0)
-   [
    
    1.8.2 / 1.5.2
    
    May 23
    
    ](#v-1.8.2-1.5.2)
-   [
    
    1.8.1 / 1.5.1
    
    May 20
    
    ](#v-1.8.1-1.5.1)
-   [
    
    1.8.0 / 1.5.0
    
    May 18
    
    ](#v-1.8.0-1.5.0)
-   [
    
    1.7.2 / 1.4.1
    
    May 11
    
    ](#v-1.7.2-1.4.1)
-   [
    
    1.7.0 / 1.4.0
    
    May 11
    
    ](#v-1.7.0-1.4.0)
-   [
    
    1.6.0 / 1.3.0
    
    May 4
    
    ](#v-1.6.0-1.3.0)
-   [
    
    1.5.1 / 1.2.1
    
    Apr 13
    
    ](#v-1.5.1-1.2.1)
-   [
    
    1.5.0 / 1.2.0
    
    Apr 13
    
    ](#v-1.5.0-1.2.0)
-   [
    
    1.4.1 / 1.1.1
    
    Apr 6
    
    ](#v-1.4.1-1.1.1)
-   [
    
    1.4.0 / 1.1.0
    
    Apr 6
    
    ](#v-1.4.0-1.1.0)
-   [
    
    1.3.3 / 1.0.3
    
    Mar 29
    
    ](#v-1.3.3-1.0.3)
-   [
    
    1.3.2 / 1.0.2
    
    Mar 27
    
    ](#v-1.3.2-1.0.2)
-   [
    
    1.3.1 / 1.0.1
    
    Mar 23
    
    ](#v-1.3.1-1.0.1)
-   [
    
    1.3.0 / release
    
    Mar 22
    
    ](#v-1.3.0-desktop)
-   [
    
    1.2.0 / Beta 4
    
    Feb 15
    
    ](#v-1.2.0-beta4)
-   [
    
    1.1.0 / Beta 3
    
    Jan 31
    
    ](#v-1.1.0-beta3)
-   [
    
    Release / Beta 2
    
    Jan 4
    
    ](#v-android-release-beta2)

2025

-   [
    
    v1.0-beta.6.2
    
    Dec 24
    
    ](#v-beta-6.2)
-   [
    
    v1.0-beta.6
    
    Dec 21
    
    ](#v-beta-6)

Android · Desktop

2.0.0 / 2.0.0

June 27, 2026

## 2.0 — Living Companion Souls, Director Group Chats, Time Awareness & a Reinvented Desktop

The largest LettuceAI release yet. Companions gain a living "soul" that grows from your conversations, a bipolar relationship model where warmth is earned, and full time awareness with an in-app clock you can override. Group chats get a hands-on Director mode, a participants bar, per-group appearance, and message search. The desktop app gets a custom title bar, rounded corners, and frameless resize. Add MTP speculative decoding and the XTC sampler for local models, a guided bring-your-own-key onboarding, branch-tree navigation, two new providers, full localization, and a long tail of reliability work.

**Companion Souls** — companions now have a persistent "soul" that the app writes and evolves over time: traits, backstory, appearance, goals, likes, and fears, each on its own mutability tier. A growth-cycle engine grows and supersedes the soul from your new memories, with a consolidation pass to keep it coherent

**Earned relationships** — closeness, trust, and affection are now bipolar (they can go negative), shown as center-origin meters, and updated with a leaky, asymmetric, saturating model so warmth is earned slowly and erodes naturally instead of snapping to a number

**Time awareness** — give a companion a sense of _now_: override the in-chat clock (freeze it, set a custom time, or let it tick), with a custom in-app date/time picker, a time widget, absolute-dated memories, and live "relative time" on memories

**Director group chats** — tap a character's avatar to choose who replies next; Director mode drives the send button, supports action and cue styles, a configurable hint position, and a sticky last-pick, alongside a new participants bar with per-character mention, mute, and appearance controls

**A real desktop app** — a custom title bar with selectable designs, position and size, rounded window corners, and edge-resize handles for the frameless window

**Faster local models** — MTP speculative decoding with bundled and external draft models (including Gemma 4 shared-assistant drafters), a Full SWA Cache toggle, and the new XTC sampler

**Guided setup** — a redesigned onboarding that explains bring-your-own-key with a plain-language car metaphor, a free-vs-paid choice, and screenshot-driven Gemini and OpenRouter flows, finishing with a one-tap embedding-memory download

Companion & Soul

-   New **soul writer** that composes and updates a companion's soul, now aware of the full block set: traits, backstory, appearance, goals, likes, and **fears**
-   **Growth-cycle engine** evolves the soul from newly formed memories, with conflicting or outdated growth automatically superseded
-   **Soul tiers**, including a very-slow tier with an overlay-rendered core, so deep identity changes only slowly
-   **Soul consolidation** pass that merges and tidies accumulated growth
-   **Soul growth viewer** to review what changed, clear it, or delete individual entries (with a confirm step and no jarring full-page reload)
-   Live output and an abort control while the soul generator runs
-   Relationships widened to a **bipolar range** in the schema, with negative baselines allowed in the soul writer and worded prompt bands describing the current state to the model
-   Fresh relationships now start cooler, so the model has somewhere to grow
-   Decreasing relationship values render in a danger color for at-a-glance feedback
-   Dynamic memories can now **supersede** older, outdated ones instead of piling up

Time Awareness

-   **Effective-now time override** feeds a chosen clock into prompt time, temporal memory queries, and memory stamps — companion temporal filtering only applies when time awareness is enabled
-   Time-override controls in chat settings and the appearance drawer, plus a **time widget** to display and override the companion clock
-   A custom **in-app date/time picker** replaces the native control: typeable hour/minute, a clickable month/year selector, wrapping time steppers, and a max year raised to the JS date ceiling
-   Memories show **live relative time**, are instructed to store **absolute dates**, and can have their date set or cleared with the picker
-   Transcript messages are stamped with effective-frame timestamps, and any timestamps the model echoes back are stripped automatically

Group Chats

-   **Director speaker mode**: tap an avatar to choose who responds; the selection drives the send button (no separate confirm/cancel), with action/cue styles, a default cue style, a configurable hint position (top/bottom/hidden), a selection animation, a sticky last pick, and a wiggle nudge if you send with nobody selected
-   **Participants bar** with per-character mention toggle, mute, and appearance controls, plus avatar-shape options and a solid/fading/transparent bar background
-   **Per-group chat appearance**: a dedicated editor (desktop drawer + mobile page) backed by group-level appearance override storage
-   **Message search** with a header button and jump-to-message that loads older messages from the database when needed
-   **Per-session author notes** with an inline editor and prompt injection
-   **Chat widgets on desktop** for group chats: render and edit the widget area, fed real group data, with a per-widget character picker
-   **Dynamic memory references** captured and displayed per message
-   Group session settings open as a desktop side drawer; create and settings pages redesigned with a two-column layout and footer CTA; headers and avatars unified under a shared option-row design
-   Every group session is now connected to a source group, with orphans backfilled

Desktop Experience

-   **Custom title bar** with selectable designs, position, and size
-   **Rounded window corners** (with WebKitGTK repaint workarounds) and **edge-resize handles** for the frameless window
-   Fullscreen overlays are correctly offset below the custom title bar, and the corner toggle is guarded across decoration changes
-   Inline desktop search on Discovery with skeleton parity, tag search, a "pure mode" blur, and infinite scroll

Local Models (llama.cpp)

-   **MTP speculative decoding** with bundled and external draft models, support for Gemma 4 shared-assistant drafters, and early-stop when draft confidence drops
-   **Full SWA Cache (swa-full)** toggle per model
-   New **XTC (Exclude Top Choices)** sampler with probability/threshold controls, off by default
-   Switched to a maintained llama-cpp-rs fork (b9611) for MTP support
-   Memory planning now accounts for sidecar memory, and MTP fields are forwarded through the provider extra-body allowlist (including in group chats)
-   Linked sidecar installs are grouped into a single download card

Branching

-   **Branch-tree navigation** with lineage view, branch comparison, and a fork marker
-   A parent-branch confirm menu and child-fork indicators in chat, with the in-session branch-point id persisted so the marker renders reliably
-   Corrected active-path line coloring with a stable tree layout

Onboarding

-   Guided **bring-your-own-key** setup with a plain-language car metaphor, a free/paid choice, and screenshot-driven Gemini and OpenRouter flows
-   Guided **embedding-memory finish** step with a one-tap download and a handoff into the in-app tour
-   Bottom-anchored mobile welcome migrated to Tailwind, a horizontal guided carousel strip, a welcome card in the getting-started flow, and secondary links collapsed into an "Other options" menu
-   Harder warnings on skip, and character creation is gated until at least one model exists

Providers & Memory

-   New **Gemini Agent Platform (Express)** provider
-   New **LiteRouter** provider
-   Dynamic memory **run modes**: ask-first approval menu and a manual-gating setting, so memory updates can require your confirmation
-   Bulk update of protected prompts to the latest versions with full auto-refresh coverage and reset logging

Audio & Scenes

-   **Audio input**: upload, play back, and track token usage for audio in chat
-   New **Audio library** tab listing TTS and chat-uploaded audio with player cards and a per-card actions menu (open in chat, download, delete)
-   **Inline scene images** rendered in chat, with image/GIF insert in the scene editors and an in-bubble indicator while a scene image prompt streams; inline image tokens are stripped from API requests

Customization & Appearance

-   Accessibility settings reorganized into a **Customization** page
-   Custom **chat input color** with adaptive contrast
-   Resizable appearance drawer with a compact clear-overrides control and collapsible, animated sections
-   **Dice roll** added to the plus menu with editable notation
-   Chat-appearance and time-override actions surfaced from the group chat message menu and chat footer

Models & Model Browser

-   **Author profiles** in the model browser, with an author filter and in-profile search
-   Configurable **model folders** with atomic migration, plus a local runtime-defaults page for llama.cpp defaults, matched to the advanced-settings design
-   Runnability scoring now accounts for MTP next-n layers and QAT quants, and recommended KV cache is capped at Q8
-   "Go to Models" shortcut and an Image Generation top-nav title; warnings before deleting a prompt or model that is in use

Voices, Docs & Localization

-   **Kokoro** model setup moved to Voice providers with a guided download menu; a stale Kokoro asset root now self-heals from synced platforms
-   Documentation links surfaced across the app with a completed docs map, translated into every locale
-   Frontend user-facing strings routed through the locale system, with all locales synced and new translation tooling — XTC, branch tree, companion setup, scenes, memory, audio, and more translated across ~20 languages

Reliability & Fixes

-   Companion: a turn save can no longer clobber concurrently-changed time preferences; time override is preserved across state round-trips and kept synced to the live clock (no more empty field or ticking jump-back); models no longer echo system timestamps back into replies
-   Memory: partial vector migrations are persisted so an aborted re-embed stops looping; embedding failures are now visible in the logs instead of failing silently
-   Group chat: fixed a panic on a non-char-boundary log preview slice; "continue" no longer impersonates the persona or duplicates the last user turn; a null chat-appearance no longer breaks group parsing or hides the appearance button; saving appearance settings no longer reloads the whole chat
-   Image: fixed a doubled API version in the Gemini image URL; the image viewer centers correctly when there is no prompt
-   Backup: Android backups are written to Downloads via MediaStore and indexed for list/delete, with the real export/import error surfaced instead of a generic one
-   Settings: `forceSendThinkingState` persists correctly; chat appearance and widget data are preserved when editing a character; sidebar provider/model counts refresh on in-app changes
-   Builds: Nix flake added; Windows Vulkan/CUDA, Linux SPIRV-Headers, long-path, and PR-check path issues fixed; macOS bundles use ad-hoc signing with Gatekeeper instructions in the README

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.9.0 / 1.6.0

May 31, 2026

## Chat Widgets, a Live-Preview Appearance Drawer & Companion Memory Tools

A big chat-customization release. It introduces a full Chat Widgets system for building a custom side panel next to your conversation, a redesigned side-anchored appearance drawer with live preview and desktop column controls, hands-on companion memory tooling with manual processing and live output, a flatter rebuilt model editor, plus smarter local-model thinking, anti-loop dynamic memory, and a long tail of reliability fixes.

What's New

-   **Chat Widgets**: build a custom panel beside your conversation from composable widgets, including character and persona info, scratch pad, image, stat tracker, memory, companion state, quick snippets, dice, session info, author note, and layout pieces (divider, box, selector, button). Edit in place with a sticky toolbar, drag-to-reorder, an Add-widget picker, per-widget design variants (default, minimal, solid, outline), a real library image picker, and cross-column moves. Widget layouts are saved per character
-   **Live-preview appearance drawer**: chat appearance moved into a side-anchored drawer you can open from the chat header, with a live preview that updates as you tweak, plus a tabbed shared form, side-flip, and a message-actions entry
-   **Desktop chat layout controls**: set the chat column width, alignment, and full-shell behavior, split the shell into independent header and footer toggles, add a center widget mode, and resize the widget area with a draggable divider. Group chats mirror the same settings
-   **Companion memory tools**: trigger a memory-processing cycle manually, watch it run with a progress bar and live output viewer (with cancel), and review and edit the generated context summary inline in a new editor card
-   **Redesigned model editor**: a flatter, box-free layout with unified section tabs and a runtime-report drawer, width-aware on desktop and clean on mobile
-   **Smarter local-model thinking**: a force-send thinking-state toggle for local models and recognition of Gemma channel-style reasoning tags
-   **Anti-loop dynamic memory**: adjusted sampling to reduce repetition loops, with live visibility into generation
-   **Per-message info**: optionally show the model that generated each message, input/output/total token counts, time-to-first-token, and tokens/sec, with an independent toggle for each, a choice of placement (below the header inside or outside the bubble, inside the bubble, or below it), and a text-size option

Improvements

-   Added optional author name and timestamp headers above messages
-   Added support for image-only OpenRouter models
-   Made the Help me Reply history window configurable so it can look back as far as you want
-   Companion relationship meters now show low and high anchor labels for context
-   The scroll-to-bottom button tracks the composer height as it grows and anchors to the messages column when widgets are shown
-   Sharpened local-model runnability scoring with MoE active-path awareness, an expanded quantization table, KV cache quant types, and a repaired GGUF parser
-   Added a direct Save action to the unsaved-changes toast
-   Local-model performance metrics (time-to-first-token and tokens/sec) are now saved with each message and shown in message details after a reload, in both direct and group chats
-   Chat background blur is applied to the image directly, dropping the separate bubble-blur control for a cleaner result
-   The chat settings drawer now saves and updates the session immediately after changing a value
-   Added a seeded companion benchmark generator for repeatable 20-message test chats

Fixes

-   Streaming messages now apply the chat appearance settings (such as the author name and timestamp header) while generating, instead of only after the message finishes
-   Fixed identity placeholders leaking into injected memories, lorebook entries, and summaries
-   Fixed the model selector's "only free models" toggle colliding with the title on mobile
-   Companion memory now allows companion categories on edit and stops placeholder leakage
-   llama.cpp drops the existing model before reload, avoiding double-pinned VRAM
-   Local models now load more reliably with GPU offload. Context sizing accounts for layers offloaded to the GPU, so a model that runs fine with mixed CPU and GPU offload is no longer wrongly reported as too big to fit
-   Improved llama.cpp VRAM headroom estimates so context creation no longer fails with out-of-memory on partially offloaded models. The compute-buffer reserve is derived from the model's dimensions and batch size, and a context that hits OOM is retried at a smaller size even when a KV cache type is set
-   Cleaned up orphaned memory embeddings and repaired the embeddings migration
-   Made the speech-recognition migration idempotent
-   The creation helper can now use llama.cpp models
-   Settings are now reloaded after successful syncs
-   The reset flow removes Whisper and Kokoro models
-   The group chat memories page now renders properly when a chat background image is used

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.8.2 / 1.5.2

May 23, 2026

## In-App Help, Friendly HTTP Errors & Character Design References

A polish release that adds a proper in-app Help & FAQ for BYOK and setup questions, introduces design references on the character create flow so scene generation stays on-model, translates raw provider HTTP errors into friendly explainers across chats and group chats, and cleans up a long tail of navigation, sync, and onboarding fixes.

What's New

-   New in-app **Help & FAQ** page covering BYOK, API keys, free vs paid providers, tokens, privacy, and common setup questions, with a shortcut from onboarding so new users aren't dropped in cold
-   Added **design references** (visual description + reference images) to the character create flow, so scene generation stays on-model from the very first chat
-   Unified the **local model requirements prompt** across sync onboarding so the embedding check is presented consistently

Improvements

-   HTTP errors in chats and group chats now get a friendly explainer that names the problem (rate limit, out of credits, model not found, content blocked, provider down, etc.), suggests a fix, and keeps the raw error one tap away
-   Aligned the reasoning header and toggle styling for a cleaner message presentation
-   Redirected the Settings _Convert Files_ entry to lettuceai.app/convert

Fixes

-   Fixed the chat templates back arrow looping between Templates and Settings — the editor now returns to the template list, and the list returns to character edit
-   Stopped the banner avatar picker from overflowing narrow containers
-   Restored missing chat template options on mobile
-   Corrected the misleading "Continue to Starting Scenes" button label shown while already on the Starting Scenes step
-   Defaulted the summarisation model correctly during onboarding and runtime so dynamic memory works out of the box
-   Applied the user's custom TLS trust store to image-generation requests so self-signed endpoints work like the rest of the app
-   Surfaced the embedding-model prompt after sync completes when the local model requirement isn't met
-   Reapplied the sync `Ready` handshake and post-sync completion fixes that were lost in a previous merge
-   Respected the device safe area inside the what's-new drawer so content no longer sits under the notch or home indicator

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.8.1 / 1.5.1

May 20, 2026

## Model Loading, Import & Memory Migration Polish

A stability and UX polish release that quiets false GPU-optimal warnings when KV cache lives in RAM, fixes Android chat import from document-picker URIs, unblocks dynamic memory vector migration from a stuck startup toast, surfaces ETA on download queue cards, and refreshes chat settings drawers so quick-settings no longer reopen stale.

Summary

Changes

-   Fixed Model Browser GPU-optimal warnings so they no longer appear when KV cache is explicitly placed in RAM.
-   Fixed Android chat import to support document picker `content://` URIs for JSONL chat files.
-   Fixed dynamic memory vector migration getting stuck behind a permanent startup toast by adding timeout and failure handling.
-   Added ETA display to model download queue cards and fixed missing download queue locale strings.
-   Fixed llama startup toasts appearing for models that were already loaded and reused.
-   Fixed chat settings drawers reopening with stale quick-setting values by refreshing the latest saved character by ID.

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.8.0 / 1.5.0

May 18, 2026

## Sync Onboarding, Banner Cards & Shared Memory

A feature-heavy release that adds device-to-device sync onboarding with readiness checks, introduces a new persisted banner card system for characters, expands companion memory with shared state and scheduled notes, broadens provider and TTS coverage, and hardens migrations, imports, backups, and chat appearance persistence.

Added a new "Sync from another device" onboarding flow with embedding checks and a readiness handshake before transfer begins

Expanded companion memory with shared cross-session memory, scheduled notes, custom summarizer and memory-manager prompts, and character-driven time-awareness defaults

Added Fish cloud/local audio TTS providers plus Cerebras and Pollinations provider integrations

Added a new `banner` character card system with its own persisted card type, banner media asset, crop state, editor controls, and chat-list rendering path

Reworked major settings screens, including desktop-first provider layouts, prompt editors, chat appearance tooling, and reusable numeric input controls

Replaced `chatpkg` zip export with SillyTavern JSONL and preserved banner crop metadata across imports and transfers

Hardened storage and migrations with repairs for missing character columns, safer invalid-character handling, and backup/sync preservation for memory embeddings and advanced settings

Fixed chat appearance persistence and background styling regressions, including stale override saves and inconsistent preview/render behavior

Sync, Onboarding & Data Reliability

-   Onboarding now includes a dedicated "Sync from another device" flow that checks embedding readiness before sync starts and guides the handoff more clearly
-   Sync adds a `Ready` handshake, stronger failure handling on passenger apply, disconnect cleanup for stale host approval state, and more stable progress and asset-transfer lifecycle reporting
-   Backups and sync now preserve `memory_embeddings` as the source-of-truth dataset and retain advanced settings that were previously at risk of being dropped
-   Character loading skips invalid records instead of failing broader list reads, reducing the impact of bad local data
-   Startup and migration repair paths now backfill newer `characters` columns, including banner crop fields, to recover databases whose schema drifted out of sync with migration state

Character Cards, Companion, Chat & Settings

-   Characters now support a first-class `banner` card type rather than only the older circular card presentation
-   The banner system adds a persisted `cardType` field plus separate `bannerCrop` state, schema support, storage columns, migration coverage, and startup repair logic for older databases
-   Character creation and editing now expose explicit card-type switching, dedicated banner image picking, banner-specific cropping, and fallback behavior to the base avatar when no separate banner image is set
-   The chats surface gained a separate `BannerCharacterCard` rendering path and banner-aware avatar handling instead of reusing the old circular card UI
-   Imports and character transfers now preserve banner crop metadata so the new card design survives sync and package movement intact
-   Companion memory now supports shared memory across sessions, scheduled notes storage and editing, lorebook conditions tied to those notes, and prompt injection for scheduled-note context
-   Characters can initialize companion time awareness from their defaults, reducing manual setup for time-aware flows
-   Chat appearance customization was substantially reworked with a better desktop layout, preview relocation, transparent-header control, and fixes for background dimming, blur, bubble opacity previewing, and per-character override persistence
-   The providers page, provider editor, system prompt editor, and prompt entry cards were reorganized into cleaner desktop-first layouts with reusable field primitives and a new `NumberInput` control
-   Library and chats load faster through deferred gradients, cached list state, lazy avatar loading, and general page-load optimizations

Providers, Media & Platform Fixes

-   Added Cerebras AI and Pollinations AI providers
-   Added Fish TTS support for both cloud and local audio flows
-   Android debug APKs no longer force JNI debug symbols, reducing debug-build baggage on the mobile side
-   Post-update discovery improved with a new "What's New" drawer, and onboarding/welcome flows were redesigned into a more unified first-run experience

Fixes & Stability

-   Chat appearance saves now persist only the intended per-character override instead of reintroducing stale old values after reset-and-save flows
-   Chat backgrounds now honor configured styling more consistently, including footer/header transparency behavior and preview parity for blur and opacity controls
-   Import/export and storage repairs reduce the chance that banner assets, banner crop state, or newer character fields disappear during transfers or after upgrades
-   Sync and onboarding flows now fail earlier and more clearly when prerequisites are missing, instead of allowing half-complete transfers to proceed

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.7.2 / 1.4.1

May 11, 2026

## Provider Leak Fixes, Groq Compatibility & Linux Packaging

A hotfix release that plugs internal request metadata from leaking to providers, fixes Groq's model listing and ships its logo, restores Android TLS by reverting to bundled rustls roots, re-enables text tool-call parsing on mobile, and lights up Linux distribution via Flatpak, AUR, and Debian repo publishing workflows.

> This release rolls up the Android 1.7.1 TLS revert alongside the Android 1.7.2 mobile fixes and the Desktop 1.4.1 cross-cutting changes, so a single changelog covers all three hotfixes shipped after the 1.7.0 / 1.4.0 launch.

Stopped internal request metadata (memory tool envelopes, debug fields, retry hints) from leaking into provider payloads

Stripped visible chat metadata from outbound provider messages so model context stays clean across every transport path

Fixed Groq's model listing to use the OpenAI-compatible `/openai/v1/models` endpoint and added the Groq provider logo

Restored Android networking by reverting reqwest to the bundled rustls-tls roots

Re-enabled text-based tool-call parsing on mobile builds

Added Linux package publishing workflows for Flatpak, AUR, and a Debian repository

API & Provider Fixes (Android & Desktop)

-   The request builder now scrubs internal metadata (memory tool envelopes, debug hints, retry bookkeeping) from every outgoing provider message instead of relying on per-provider strippers
-   Visible chat metadata is stripped from every provider message regardless of provider, so the same scrubbing path applies whether the call goes through chat completions, continuation, regenerate, or reply-helper
-   Groq's model discovery now calls the OpenAI-compatible `/openai/v1/models` endpoint so the model picker actually populates instead of failing silently
-   The Groq provider gained a proper logo in the provider icon map and the model selector

Android (1.7.1 + 1.7.2)

-   Reverted reqwest's TLS backend to the bundled rustls-tls roots so HTTPS works on devices where the system trust store rejects requests (Android 1.7.1)
-   Removed the mobile-only short-circuit that disabled text tool-call parsing so models that emit tool calls as plain text (rather than native tool blocks) work on mobile again (Android 1.7.2)

Desktop (1.4.1)

-   Added a Linux release pipeline that builds and publishes Flatpak, AUR, and a Debian package repository alongside the existing Tauri bundles
-   New scripts package a tarball for AUR consumers, prepare the AUR PKGBUILD, and publish the Debian repo
-   The desktop release workflow now wires these jobs in so a single tag push produces all Linux artifacts

Fixes & Stability

-   Provider message construction is now centralized so metadata-scrubbing fixes cannot regress per-provider
-   Mobile TLS no longer depends on system roots being present, which prevented HTTPS on stripped-down Android builds
-   Mobile tool-call parsing matches desktop, so prompt-only tool flows behave consistently across platforms

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.7.0 / 1.4.0

May 11, 2026

## Agentic Smart Creator, Lorebook Generator & Redesigned Settings

This release rebuilds the Smart Creator and lorebook generator as true tool-calling agents, ships a redesigned Settings, Usage Analytics, and Installed Models experience with persistent sidebars and dense tables, and hardens self-hosted setups with trusted-cert support and native OS trust roots, alongside companion temporal awareness, persona-attached lorebooks, llama.cpp DRY sampler controls, and a deeper avatar pipeline.

Rebuilt the Smart Creator (character and persona Creation Helper) as a true tool-calling agent with a planner, tool executor, persistent state machine, structured-fallback parser for non-tool models, inline pills, image gallery, and a "save and chat" handoff

Added an agentic lorebook generator with a multi-stage planner → writer → refine → coherence pipeline, persistent draft storage, and a guided full-flow page reachable from the Create menu

Redesigned Settings around a persistent collapsible sidebar that survives navigation, merged Lorebooks and Companions into hub pages, and refreshed the About and Reset pages

Redesigned Usage Analytics and the activity log with a dense table layout, KPI strip with tabular numerals, ranked bar lists in place of pie charts, and a real desktop grid table for recent activity

Added a Hugging Face browser → Ollama destination flow with NDJSON-streamed remote pulls, remote-inventory listing and deletion, three result view modes, pipeline-tag chips, and parameter-size presets

Added a Security flow for importing trusted PEM certificates plus a per-provider "allow invalid TLS" toggle, and switched the TLS stack to native OS trust roots so system-installed certs are honored

Personas can now attach lorebooks, with the relationship threaded through completion, continuation, regenerate, and reply-helper flows and preserved across backup, sync, entity transfer, and import

Added a session-relative temporal layer to the companion memory pipeline so recall and stamping understand session time, with new per-session settings UI and developer diagnostics

Image generation requests now carry the originating session, character, and a sub-flow tag so usage records are attributable to the actual chat instead of collapsing into one synthetic "Image Generation" entry

Added DRY (Don't Repeat Yourself) sampler controls for llama.cpp, wired all the way from the runtime through provider configuration, schema, the model editor, the controller hook, and the per-session advanced-settings panel

What's New

-   The Smart Creator now runs a single goal-scoped agent against a per-tool allowlist from settings, with the smart-vs-manual split removed and the lorebook flow split out into its own surface
-   "Create new" no longer auto-resumes incomplete sessions; "continue last" is now a separate explicit action
-   The lorebook generator runs as planner → writer → refine → coherence stages, persists drafts, and exposes a guided multi-step UI as well as a settings entry point
-   A new reusable single-character selector is shared between the lorebook generator and group-chat settings
-   Personas gained an active-lorebooks list that flows through every reply path and round-trips through backup, sync, entity transfer, and import
-   The companion memory flow was substantially rewritten to consume temporal metadata, with prompt engine, parameter engine, and entry conditions all gaining temporal hooks
-   The HF model browser can push downloads to a remote Ollama provider, streaming `/api/pull` NDJSON status into the shared download queue with cancel and progress
-   The HF browser exposes remote Ollama inventory with first-class listing and deletion of remote models
-   The HF result list gained list / grid / gallery view modes with a sliding indicator on the sort bar, persisted across sessions and wired into the layout toggle in the top navigation
-   The HF filter sheet now has pipeline-tag chips, min/max parameter inputs, and quick presets (≤3B, 3-8B, 8-14B, 14-34B, ≥34B) with an active-count badge
-   The Installed Models page got a Local / Ollama tab switcher with prefetched per-tab counts, a flat (non-card) table look, dense desktop grid with mobile pill collapse, and icon-only Copy / Delete row actions
-   A new Security flow lets users import trusted PEM certificates which are applied to every outbound HTTP client, including provider verification, generic API transport, and Ollama
-   Each provider gained an "allow invalid TLS" toggle scoped to matching configured base URLs, so self-hosted endpoints can be excepted without globally relaxing TLS
-   The TLS stack was switched from bundled webpki roots to native OS trust roots so certificates added to the system trust store are honored
-   Llama.cpp DRY sampler controls were added across the runtime, provider configuration, schema, model editor UI, model controller hook, and per-session advanced-settings panel
-   Providers can now declare their own extra-body keys, which the request builder honors instead of hard-coding per provider
-   The chat message debug page can show the raw outgoing body per retry attempt
-   Image generation requests can carry session, character, and sub-flow tags; scene-mode generations populate them automatically and creation-helper image calls are tagged accordingly
-   The Usage Activity page gained operation-type filter chips, search across character / model / provider / op, and inline pagination
-   The Usage detail sheet was tightened: meta chips for time / provider / finish reason, dense token-usage tiles that hide zero rows, and a highlighted total cost
-   The Settings shell now uses a persistent collapsible sidebar that survives navigation, with Lorebooks and Companions merged into hub pages
-   The avatar pipeline gained source selection (base vs cropped), MMCQ palette quantization, manual gradient regeneration with cache invalidation, and proper persistence of cropped square and round variants
-   The color picker stops re-rendering the full character editor on every color tick; updates stay local until the input is committed

Smart Creator (Agent Rebuild)

-   The Creation Helper was rebuilt as a true agent with a planner, tool executor, dedicated LLM driver, persistent state machine, and a structured-fallback parser for models without native tool calls
-   The frontend was rewritten to render inline pills, an image gallery, and a "save and chat" handoff
-   The lorebook flow was removed from Smart Creator entry points; the agent is now scoped to character and persona creation only
-   The previous smart-vs-manual split was replaced with a single goal-scoped agent that always runs against the enabled-tool allowlist from settings
-   Settings was simplified into one explicit agent control surface covering model selection, streaming, image generation, and per-tool access
-   Selecting "create new" no longer silently resumes incomplete sessions; "continue last" is a separate explicit action

Lorebook Generator

-   A new multi-stage pipeline runs planner → writer → refine → coherence with a state machine and persistent draft storage
-   The frontend gets a guided full-flow page (planning, drafting, coherence, commit) and a dedicated settings entry point
-   A reusable single-character selector is shared with group-chat settings
-   New prompt scaffolding was added for each stage so the planner, writer, refine, and coherence steps each have purpose-built prompts
-   The generator is reachable from the global Create menu and the top navigation

Settings & UI Redesign

-   Settings was rebuilt around a persistent collapsible sidebar that survives navigation
-   The About page was substantially refreshed and the Reset page was simplified
-   Lorebooks and Companions were merged into hub pages reachable from the sidebar
-   The "send as system" confirm dialog was reworked to match the bottom-menu design language used elsewhere

Usage Analytics

-   The dashboard, activity log, and shared detail sheet were rewritten with a dense table layout
-   Header tab pills (Dashboard / App Time) replace the old top toggle
-   KPI strip uses tabular numerals and a single highlighted accent tile for total cost
-   Ranked horizontal bar lists replace the pie charts for the by-model and by-character breakdowns
-   A real desktop grid table renders recent activity with hover, dense rows, and a mobile-collapsed two-line layout
-   The Activity page gained operation-type filter chips with counts, search across character / model / provider / op, and inline pagination
-   The detail sheet hides zero-value token rows and only shows non-zero extra costs, with the total cost tile highlighted
-   Operation type labels were normalized so camelCase variants from the backend (`imageGeneration`, `groupChatRegenerate`, `aICreator`) resolve to their proper translated labels instead of leaking as raw uppercase tokens

Personas & Companion Memory

-   Personas now have an active-lorebooks list that is threaded through chat completion, continuation, regenerate, and reply-helper flows
-   Backup, sync, entity transfer, and import all preserve the persona-lorebook relationship
-   The persona editor exposes a lorebook selector
-   A new temporal layer gives the companion memory recall and stamping a notion of session-relative time
-   The memory flow was substantially rewritten to consume temporal metadata, and the prompt engine, parameter engine, and entry conditions all gained temporal hooks
-   New per-session settings UI for time awareness lives in the chat settings panel; the developer page exposes diagnostics
-   Group chats and scene generation also receive the temporal context

Models, Inference & Image Generation

-   DRY sampler controls were added for llama.cpp, exposed in the model editor, the model controller hook, and the per-session advanced settings panel
-   Providers can declare their own extra-body keys, which the request builder honors instead of hard-coding per provider
-   The chat message debug page can show the raw outgoing body for each retry attempt
-   The HF model browser can push downloads to a remote Ollama provider, with NDJSON-streamed progress, remote-inventory listing, and remote-model deletion
-   HF results gained list / grid / gallery view modes, pipeline-tag filter chips, min/max parameter inputs, and quick presets (≤3B, 3-8B, 8-14B, 14-34B, ≥34B)
-   The Installed Models page now has Local / Ollama tabs with prefetched per-tab counts, a flat table look, and icon-only Copy / Delete actions
-   Image generation requests can carry the originating session, character, and a sub-flow tag (`scene`, `creation_helper`, etc.) so renders are attributable to the actual chat instead of collapsing into one synthetic "Image Generation" record
-   Standalone callers (the manual image-generation page) continue to use the existing placeholder, fully backward compatible
-   The embedding test and benchmark suites were significantly expanded for v3 vs v4 comparison work

TLS & Self-Hosted Networking

-   A new Security flow lets users import trusted PEM certificates, which are applied to every outbound HTTP client (provider verification, generic API transport, Ollama)
-   Each provider gained a per-provider "allow invalid TLS" toggle scoped to matching configured base URLs
-   The TLS stack was switched from bundled webpki roots to native OS trust roots, so certificates installed in the system trust store are honored without re-importing them in the app

Avatar Pipeline

-   Repositioning now exports both square and round variants and saves the cropped output instead of keeping the original uncropped image
-   Avatar sources are normalized to data URLs before canvas export to avoid browser security errors
-   Avatar gradients can now be sourced from either the base or the cropped image, with that choice persisted on each character
-   The dominant-color extraction was replaced with MMCQ palette quantization
-   Pixel filtering was removed so grayscale and flat-color avatars no longer fall back to the default purple gradient
-   A forced regeneration path can bypass the on-disk and in-memory caches; a compact refresh button next to the avatar-gradient toggle exposes it with loading feedback and toasts
-   The brightness/saturation boost on the gradient generator was removed so colors track the actual avatar palette
-   Custom gradient defaults seed from the detected palette instead of a hardcoded purple fallback
-   The color picker no longer re-renders the full character editor on every drag tick; updates are committed only on final input

Fixes & Stability

-   Backup and sync layers were substantially expanded so newly added schema fields no longer get dropped on round-trip
-   The V3 → V4 dynamic memory upgrade toast now only fires when the user actually has dynamic memory v3 installed, instead of showing for everyone
-   Replaced ad-hoc fallback IDs with proper UUID v4s in the character and persona creation paths
-   The model-download progress strings now resolve correctly in every locale instead of leaking raw i18n keys
-   The Companion soul settings page is now scrollable

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.6.0 / 1.3.0

May 4, 2026

## Companion Mode, New Voice Features & Embedding v4

This release introduces Companion Mode with live relationship state, companion memory surfaces, and soul authoring, while also adding Kokoro TTS, local speech recognition, and the new lettuce-emb-v4 memory model, alongside broader lorebook, runtime, and storage improvements across desktop and Android.

Added Companion Mode as a new interaction model with a dedicated relationship-oriented prompt path, authored companion soul configuration, live emotional and relationship state, and companion-specific memory and inspection pages

Added `lettuce-emb-v4` as the new embedding model for the memory layer, with a large roleplay-retrieval quality jump, 768d native embeddings, Matryoshka dimensions, and ONNX exports

Companion soul authoring is much deeper, with a full editor, presets, in-chat editing surfaces, and an AI-assisted Companion Soul Writer that can draft or refine the soul from character context

Added major new voice features through Kokoro TTS and whisper.cpp-based local speech recognition, making local speech workflows practical across platforms

Dynamic memory, storage, and session persistence received another major hardening pass, especially around normalized memory embedding storage, hot-path updates, and better consistency for long-running chats

Local runtime work continued across `llama.cpp`, offload planning, provider/model routing, and memory-related background processing, improving resilience for local and hybrid setups

What's New

-   Added Companion Mode as a separate interaction mode alongside roleplay, aimed at persistent relationship-driven chats instead of scene-first roleplay
-   Added authored companion configuration with soul fields such as essence, voice, relational style, vulnerabilities, habits, boundaries, baseline affect, and regulation style
-   Added companion relationship pages that expose live closeness, trust, affection, tension, active emotional signals, and relationship-oriented memory history
-   Added companion memory pages for browsing, editing, pinning, cooling, and pruning companion-relevant memories
-   Added in-chat companion soul editing so a companion's authored personality and relational baseline can be refined without leaving the chat context
-   Added AI-assisted companion soul generation through the Companion Soul Writer workflow
-   Added companion download/setup pages and missing-model guidance so the companion stack can be installed more intentionally instead of silently failing
-   Added `lettuce-emb-v4` as the new memory embedder, improving retrieval quality for long-running chats and roleplay memory lookups
-   Added Kokoro-related speech flows and improved voice selection/management surfaces
-   Added whisper.cpp ASR integration work so local transcription flows can participate in the app’s voice pipeline
-   Expanded lorebook workflows with more creation, generation, preview, and management tooling across character and library flows

Companion Mode (Beta)

-   Added Companion Mode as a distinct chat mode for persistent relationship-driven conversations rather than scene-first roleplay
-   Companion chats now maintain live per-session emotional and relationship state rather than only relying on static character prompts
-   The companion runtime updates state from user turns before the assistant reply is generated, so the same turn’s response can reflect evolving closeness, trust, affection, tension, and emotional regulation
-   Companion prompting now has its own template path and injects companion-state context into the prompt when needed
-   Companion setup is gated by required local models, including embedding, emotion classification, NER, and routing, which makes failures easier to understand and recover from
-   Companion-specific inspection pages make the system more legible: users can now see relationship metrics, emotional vectors, and companion-oriented memory records instead of treating the mode as a black box
-   Companion turn effects and post-turn memory plumbing were added so background memory processing can be tied back to specific turns more clearly

Voice, TTS & ASR

-   Added Kokoro TTS as a new speech-generation capability, including packaging, runtime handling, voice integration, and Android-specific eSpeak bundle work needed for reliable phoneme/voice support
-   Added whisper.cpp-based speech recognition and wired it into the broader runtime, giving the app a real local speech-input path
-   Voice management and selection behavior continued to improve across character creation and settings flows

Embedding Model v4

-   Added `lettuce-emb-v4` as the new embedding model for the memory layer, replacing the old weak roleplay-retrieval behavior with a roleplay-first embedder built for long-lived chats
-   The new model delivers a major retrieval-quality jump, with the v4 announcement reporting `0.924` recall@1 on its roleplay-memory benchmark versus `0.020` for v3
-   `lettuce-emb-v4` now uses native `768d` embeddings instead of a `512d` projected bottleneck, which removes a major quality constraint from the previous setup
-   The model supports Matryoshka slicing across `64 / 128 / 256 / 512 / 768` dimensions, so different devices can use different memory tiers without needing separate model families
-   In user terms, this is one of the biggest memory upgrades in the release: long chats, callbacks, recalled details, and roleplay continuity should all benefit from much better retrieval quality

Memory & Runtime

-   Dynamic memory storage moved further toward normalized embedded memory records instead of relying on looser legacy summary-only flows
-   Session and memory hot paths were optimized with narrower DB updates and better high-message-count consistency, reducing the cost and fragility of frequent session writes
-   Post-turn background memory scheduling became more explicit, which helps long-running chats avoid overlapping or wasteful memory work
-   Companion-mode memory now layers companion interpretation and UI on top of the shared dynamic-memory engine, which improves visibility without forking the whole memory stack
-   Local runtime work continued in model routing, provider compatibility, and `llama.cpp` behavior, helping the app tolerate more real-world local setups
-   The `llama-cpp-rs` dependency was updated again near the end of the range, keeping the vendored `llama.cpp` side current without requiring app-level API changes

Lorebooks, Creation & Content Workflows

-   Lorebook tooling expanded heavily across generation, preview, import, and management flows
-   Character creation and editing now better support companion-first authoring, including mode selection, companion prompts, and companion soul configuration
-   Companion and roleplay setup flows are now more clearly separated so users are not pushed through scene-first authoring when building a relationship-oriented character
-   Prompt-building and request-construction work continued across the chat stack, improving how character, lorebook, memory, and companion state are assembled before inference

Fixes & Stability

-   Improved session, settings, and memory persistence behavior so malformed or partial state is less likely to break the app or silently corrupt key flows
-   Hardened storage migrations and runtime state handling around newer memory and companion data structures
-   Reduced several sources of desktop runtime instability in local-model paths, especially around memory, prompting, and backend integration
-   Continued fixing packaging and setup regressions across Android and desktop speech/model flows
-   Improved internal consistency for chat branching, session copying, and related memory state carryover

Platform Notes

-   Desktop benefits most from the current companion inspection tooling because the relationship, memory, and soul pages are all deeper and easier to navigate in the desktop shell
-   Local-model runtime, offload, and memory-path hardening is especially important for desktop users running `llama.cpp`, ONNX, and mixed local/provider setups

Notable Technical Themes

-   The biggest new product feature in this release is Companion Mode: a distinct companion architecture with authored soul configuration, live relational state, and companion-specific inspection tooling
-   The voice stack now includes genuinely new product surface area, with Kokoro TTS and local speech recognition landing alongside the Android speech-packaging work needed to support them
-   The memory layer also got a major product-level upgrade through `lettuce-emb-v4`, which turns embedding quality itself into a visible feature improvement rather than only an internal model swap
-   Memory and storage work in this range focused on making long-running sessions more reliable and structurally sound instead of only adding more visible features
-   Much of the local AI work is release-hardening: better routing, better background processing, better packaging, and fewer fragile assumptions about runtime state

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.5.1 / 1.2.1

April 13, 2026

## Dynamic Memory Expansion, Local Tooling Resilience & Logging

This release heavily expands Dynamic Memory for local models, makes local tool calling and settings recovery more tolerant, improves diagnostics, and fixes several character and group chat setup regressions on desktop.

Dynamic Memory was heavily expanded for local models with a separate local memory-manager template, experimental recursive memory loops, configurable loop caps, richer lifecycle logging, and improved revert behavior

Dynamic Memory debugging is much stronger: raw cycle payloads can now be captured and inspected, and malformed local tool arguments are normalized before validation

`llama.cpp` local tool calling became more tolerant again by dropping the hard dependency on native parser metadata and falling back more gracefully

Settings and model configuration loading became more resilient, reducing cases where malformed model data could break provider visibility or onboarding flows

Rust panics now produce dedicated panic report files instead of only blending into the main app log

Character and group-chat setup regressions were fixed, including broken persistence for character group-chat prompt selections and the non-scrollable group setup page on smaller or scaled desktop displays

User-Facing Features

-   Dynamic Memory now includes an experimental `Recursive Memory Loops` mode for stepwise tool execution until the model signals completion
-   The recursive loop hard cap is configurable instead of being fixed internally
-   Dynamic Memory activity logs have improved revert UX and can reconstruct state more accurately after reverts
-   A separate protected prompt template now exists for local model Dynamic Memory manager behavior, and local providers are routed to it automatically
-   Developer-mode memory logs can now expose raw Dynamic Memory step payloads for debugging malformed local model outputs

Fixes & Stability

-   Character settings now correctly persist group chat conversation and roleplay prompt-template selections
-   Group chat creation and setup pages now scroll correctly on smaller or high-scale desktop displays because the nested flex layout no longer traps the viewport
-   Group chat header top padding was corrected to avoid a double-offset layout issue
-   Settings persistence and frontend settings parsing now salvage valid provider and model state more defensively when individual rows are malformed
-   `llama.cpp` local tool parsing no longer fails early just because a template lacks native parser metadata
-   Local malformed tool argument formats such as parameter-tag wrappers are normalized before they reach Dynamic Memory validation
-   Rust panic handling now writes separate panic logs with backtraces for easier post-crash diagnosis

Dynamic Memory & Local AI

-   Local Dynamic Memory can now run in recursive tool loops rather than a single pass, which helps weaker local models that prefer iterative tool usage
-   Recursive loop execution now emits clearer lifecycle logs covering configuration, per-iteration progress, and stop reasons
-   Raw Dynamic Memory tool-call payloads and per-step responses can be retained for developer inspection
-   Revert now restores memory summaries and related derived state instead of only removing memory entries
-   Revert UI behavior in memory activity logs was refined to make cycle rollback clearer and safer
-   Local memory-manager prompt infrastructure was split so local models can use their own protected template without changing non-local providers

Diagnostics & Logging

-   Dedicated panic report files are now generated for Rust panics with timestamp, thread, payload, location, and backtrace data
-   Dynamic Memory logging now includes raw tool-call capture and recursive-loop execution tracing
-   Settings read and write logging became much more explicit, including provider and model counts plus transaction rewrite details, which helped diagnose provider configuration failures

Desktop-Specific UX Fixes

-   The group chat creation flow now uses a proper nested scroll container instead of behaving like a second full-screen document inside the app shell
-   Character prompt override selections for group chat no longer appear to save and then silently reset when the edit page is reopened

Notable Technical Themes

-   Dynamic Memory shifted from a mostly single-pass workflow toward a more instrumented, iterative, and locally debuggable execution model
-   Local-model tool compatibility work focused on being more permissive with malformed or partially supported outputs instead of failing fast
-   Several fixes in this range were release-hardening patches driven by real-world failures rather than net-new product surface

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.5.0 / 1.2.0

April 13, 2026

## Prompt Template Upgrades, Guided Onboarding & Sync Reliability

This update expands prompt-template control, improves Dynamic Memory and local runtime reliability, adds a guided onboarding system, and hardens sync, backup, and state consistency across the app.

Prompt templates are now typed and validated through a new backend-driven parameter engine

Group chats gained editable prompt templates plus character-specific conversation and roleplay overrides

Dynamic Memory is more resilient when structured output fails, with safer cancellation and stronger validation behavior for reasoning-capable models

First-run onboarding has been replaced with a proper guided tour system across setup and early chat flows

Sync, backup, and reload behavior were hardened to preserve newer schema fields, media references, and memory progress more reliably

Turkish and Simplified Chinese are now supported, alongside broader localization coverage across the app

Prompts & Group Chat

-   Added typed prompt templates with backend-driven validation
-   Added a new backend parameter engine for prompt templates
-   Added clearer required-variable and allowed-image-slot handling for templates
-   Added editable group chat prompt templates
-   Added character-specific prompt overrides for group chats
-   Added support for separate group conversation and roleplay prompt overrides per character
-   Improved prompt and template compatibility across the app
-   Fixed protected prompt template type handling
-   Improved the prompt template editor empty state

Dynamic Memory

-   Added a configurable structured fallback format setting
-   Improved Dynamic Memory fallback behavior when ideal structured output fails
-   Added better cancellation handling for active Dynamic Memory requests
-   Fixed stale Dynamic Memory runs continuing after cancel
-   Stripped reasoning and thinking tags before summary validation
-   Improved Dynamic Memory validation reliability for reasoning-capable models
-   Preserved Dynamic Memory state more reliably on session saves
-   Improved group memory update safety by reloading latest state before applying updates

Local AI, Providers & Runtime Stability

-   Improved llama.cpp tool-call diagnostics
-   Added XML fallback parsing for malformed local tool-call outputs
-   Added raw-output recovery for local tool calls
-   Fixed non-streamed llama.cpp tool calls using the wrong parsing path
-   Improved local tool-call reliability for weaker or imperfect outputs
-   Improved Ollama whitespace handling in streamed reasoning output
-   Improved Ollama whitespace handling in native streaming deltas
-   Added proper abort support for non-streaming Ollama requests
-   Improved Gemini chat handling and overall stability
-   Fixed Gemini thinking and reasoning controls to better match model families
-   Improved cross-provider chat-state stability and fallback handling

Onboarding, Sync & Data Integrity

-   Replaced the old first-run tooltip with a proper guided tour system
-   Added guided onboarding for first run, chat detail, and post-first-message flows
-   Added a long-press hint step to the post-first-message tour
-   Added a way to reset guided tours for retesting or reuse
-   Improved the local GGUF model setup flow
-   Prevented onboarding from continuing with empty model drafts
-   Skipped guided tours after backup restore so restored installs are not treated like fresh installs
-   Improved sync and backup compatibility with the current storage schema
-   Fixed newer fields not being preserved correctly across sync, export, and import
-   Improved preservation of character design metadata in backup and sync
-   Improved preservation of group chat prompt override references in backup and sync
-   Improved preservation of session background paths and memory progress state
-   Improved preservation of group-session memory progress state
-   Expanded sync asset collection for additional referenced media
-   Improved handling of session backgrounds, design reference images, and lorebook avatar references during sync
-   Fixed prompt template entry export handling in backup logic

Reliability, Localization & Polish

-   Fixed regenerated chat variants getting out of sync after refresh
-   Improved session-state consistency after reloads
-   Improved memory-state consistency after saves and updates
-   Reduced cases where stale state could overwrite newer chat or memory data
-   Improved overall reliability across chat, group chat, prompt resolution, and memory flows
-   Added support for query-based API key requirements
-   Improved provider configuration behavior in onboarding and settings
-   Better handled provider capability differences in settings flows
-   Added Turkish language support
-   Added Simplified Chinese language support
-   Added locale icons for Turkish and Simplified Chinese
-   Added many missing translation keys across existing locales
-   Improved localization coverage across onboarding, prompts, settings, and other UI flows
-   Improved onboarding clarity and first-use guidance
-   Improved prompt editing UX
-   Improved debugging and recovery behavior around model and tool failures
-   Added general polish across chat, memory, onboarding, and provider flows

Changes

-   Removed device TTS integration
-   Reduced unnecessary dependency and build surface in some runtime paths

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.4.1 / 1.1.1

April 6, 2026

## macOS Titlebar Fixes, Onboarding Cleanup & Expanded Logs

A small follow-up update focused on desktop polish, local model onboarding consistency, and better logging visibility.

Fixes

-   macOS title is visible again
-   Local model onboarding was reworked to match the built-in `llama.cpp` flow instead of creating a fake onboarding provider
-   Exported logs now include SQLite activity and pool status
-   Mobile-only cleanup removed the local LLM button from onboarding and is not a desktop-facing change

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.4.0 / 1.1.0

April 6, 2026

## Local AI Expansion, Dynamic Memory Upgrades & Desktop UX Overhaul

This release is a major update focused on local AI, Dynamic Memory, desktop UX, and reliability.

Native **Ollama** integration joins a much more capable built-in `llama.cpp` runtime with smarter GPU offload, CPU-safe fallbacks, and better local tool calling

Dynamic Memory now has clearer progress, stronger safeguards, revert support, and more reliable recovery from stale runs

Added runtime fallback reports for local inference

Improved CPU fallback context and batch clamping

Prompt caching, chat settings, and session controls were expanded with cache-aware routing, better live-session sync, and a redesigned settings flow

Desktop gets custom frameless window chrome, wider TopNav adoption, stronger theme-token coverage, and a redesigned Logs page

Models, onboarding, networking, and platform support all received another broad pass for setup quality and runtime reliability

Local AI & Runtime

-   Added native Ollama integration and support for native Ollama tool call payloads
-   Updated `llama.cpp` and related bindings, with better local stability and runtime behavior
-   Reused loaded local models across concurrent requests to reduce unnecessary reloads
-   Added runtime fallback reports for local inference
-   Improved CPU fallback context and batch clamping, plus CPU-safe auto-context behavior on CPU runtimes
-   Fixed several CPU-only safety issues for local inference
-   Added smart GPU layer offload and strict mode overrides
-   Added GPU layer split visibility, sampler order presets, and additional reasoning settings in the model editor
-   Improved model load progress reporting and embedded template rendering via `oaicompat`
-   Improved fallback behavior across local chat templating and rendering paths
-   Hardened and stabilized local tool calling, including structured-output failure handling
-   Shared local backend usage between runtime and context-info paths

Dynamic Memory & Prompting

-   Added a progress bar for Dynamic Memory cycles
-   Allowed cancelling stale non-idle memory runs
-   Reconciled stale processing state on load
-   Improved Dynamic Memory UI state handling
-   Switched memory fallback protocol from JSON to XML
-   Hardened local tool fallback and repair logging
-   Added a llama sampler overwrite toggle for Dynamic Memory and increased its overwrite temperature
-   Hardened deletion safeguards and preset behavior
-   Added deleted memory text to tool logs
-   Added revert support for memory activity cycles
-   Preserved Dynamic Memory settings during backup restore
-   Added prompt cache TTL and sticky routing
-   Completed prompt caching support
-   Improved cache-aware usage and pricing tracking
-   Added a local RP default prompt template
-   Improved prompt and template compatibility with embedded GGUF templates
-   Supported USC system prompt imports

Chat, Scenes & Roleplay Tools

-   Added a chat settings drawer
-   Redesigned session advanced settings
-   Simplified session advanced settings in some flows
-   Restored footer focus correctly after closing the drawer
-   Synced edited messages back into the live session cache after failures and cancels
-   Added support for combined `_**bold italic**_` markdown emphasis
-   Ignored inline image tags when scene generation is unavailable
-   Added support for using chat background as a scene reference
-   Added a session-specific chat background picker
-   Added lorebook keyword detection modes and migration support for the new detection mode
-   Fixed lorebook query column alignment and schema issues
-   Improved unicode thinking parsing related to lorebooks

Desktop UX, Models & Visibility

-   Added a runability score for `llama.cpp` models and redesigned the model editor for better horizontal space usage
-   Kept the model editor on the same page after saving
-   Added a local LLM setup flow to onboarding
-   Improved recommended model installation flow
-   Fixed linking `mmproj` downloads before model creation and auto-creation of recommended installs from queue metadata
-   Unified model selector bottom menus
-   Redesigned the Logs page
-   Added full DB operation logging
-   Added `Ctrl+Shift+L` shortcut for logs
-   Hardened logs against errors
-   Improved mobile overflow and desktop layout behavior in logs
-   Added copy line(s) to the logs context menu
-   Normalized chat debug events for better parser compatibility
-   Added better message, load, and runtime visibility across the app
-   Added custom frameless titlebar and window decorations
-   Added window controls and drag regions to more pages
-   Migrated discovery pages to TopNav
-   Added window controls to chat sub-pages
-   Eliminated empty-state flashes during navigation
-   Connected more chat, sheet, and group chat surfaces to theme color tokens
-   Added About page GitHub icon support
-   Centralized toggle-switch UI work was introduced during this cycle
-   Added session background selection from chat UI

Platform, Networking & Reliability

-   Added LAN OpenAI gateway and Lettuce Host provider
-   Renamed LAN Host API to API Server
-   Deferred timed-out OpenRouter pricing refreshes instead of failing inline
-   Added iOS ONNX Runtime installer workflow
-   Improved Windows DXGI adapter and video-memory checks
-   Improved Windows Vulkan VRAM estimation clamps
-   Repaired macOS ONNX dylib acceptance
-   Fixed valid macOS title bar style enum usage for Tauri
-   Updated dependencies and runtime libraries
-   Expanded supported thinking tag variants
-   Normalized thinking tags across API and local responses
-   Improved stacked toast behavior
-   Routed persona flows through the library
-   Removed the legacy Personas page

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.3.3 / 1.0.3

March 29, 2026

## Sync Reliability, Provider Streaming Controls & Timeout Consistency

This update fixes a sync regression affecting some devices, adds per-provider streaming controls, and standardizes API request timeouts across the app.

Fixes

-   Fixed a sync issue where some devices failed to apply data from other devices due to replaying stale local sync payloads from older app versions
-   After updating, the app rebuilds its local sync state once and continues syncing using the current data format

Changes

-   Providers now have independent streaming toggles
-   Streaming can be enabled or disabled per officially supported provider
-   Features that require non-streaming, such as dynamic memory flows, continue to enforce it where needed

Improvements

-   Normalized API request timeouts to **30 minutes** across the app
-   This removes inconsistent timeout behavior between chat, memory, creation, group chat, and transport flows
-   Improves reliability for long-running requests

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.3.2 / 1.0.2

March 27, 2026

## Security Fixes, Image Generation Providers & System Prompt Tools

This update fixes two medium-severity security issues and adds new image generation integrations, prompt tooling, and a set of quality-of-life improvements across Android and Desktop.

Security

-   Fixed a backup import path traversal issue that could allow arbitrary file writes
-   Fixed a local media path traversal issue that could allow unintended file reads, writes, or deletions

New

-   Added AUTOMATIC1111 and Stability AI support for image generation
-   Added an update checker
-   Added Injection Rules for System Prompts
-   Added inline code text coloring
-   Added the ability to delete images
-   Added Scene Generation modes: `manual`, `ask first`, and `automatic`
-   Added an About App page in Settings
-   Added Debug Mode
-   Added a Reddit button

Improvements

-   Redesigned the Image Generation Settings page
-   Redesigned the System Prompts entry editor

Fixes

-   AI reference drafts now inherit model settings
-   Fixed an issue where the UI could get stuck if generation was canceled mid-process
-   Character import now accepts `.uec` files again

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.3.1 / 1.0.1

March 23, 2026

## Stability Fixes, Scene Writing Options & PNG Character Cards

This release focuses on bug fixes, small quality-of-life upgrades, and a few targeted additions for scene writing, reference text generation, and character card compatibility.

Added Scene Description Writer and Reference Text Writer LLM options

Added support for PNG-based Character Cards

Fixed message images, scene toggles, sync loading, and text color application issues

Improved unsaved-changes toast behavior and chat appearance preview consistency on mobile

Fixes

-   Fixed memories resetting when pressing Enter during editing
-   Bundled feedback sounds directly into the app binaries
-   Fixed the Sync page so it loads correctly again
-   Fixed Character Creation reset behavior after creation
-   Fixed regenerated images so they display correctly inside messages
-   Fixed scene generation so it respects the disable toggle
-   Fixed text color application on the Colors settings page

Improvements

-   Unsaved changes toasts now stay dismissed until the next leave attempt
-   The Chat Appearance preview now uses the same overlay as scene editing on mobile

New

-   Added a Scene Description Writer LLM option for better scene writing
-   Added a Reference Text Writer LLM option
-   Added support for PNG-based Character Cards

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.3.0 / release

March 22, 2026

## Images 2.0, Built-In Local AI, Sync 2.0 & Full Chat Customization

This release overhauls LettuceAI's image system, expands the local model ecosystem with built-in llama.cpp and a hardware-aware HuggingFace browser, rewrites sync, and ships a broad set of chat, UI, performance, and architecture upgrades across Android and Desktop.

Images 2.0 redesign with scene-based generation, reusable library assets, and avatar editing

Built-in llama.cpp runtime with GPU support, image generation, and native tool calling

New HuggingFace GGUF browser with hardware-aware compatibility estimates

Large chat upgrade covering group chats, templates, memory, and streaming reliability

Full sync rewrite, deeper UI customization, and a major internal architecture refactor

Images 2.0

-   Added full image generation support for chat and scenes
-   Introduced Image Language so any LLM can trigger image generation by writing scene prompts after responses
-   Unified all avatars, backgrounds, and generated images inside one reusable library
-   Added avatar generation and avatar editing with image models
-   Added reusable reference images and text for Characters and Personas during scene generation

Local AI & Model Ecosystem

-   Added a built-in llama.cpp runtime with support for NVIDIA, AMD, Intel GPUs, and Apple Silicon
-   Expanded runtime customization for local inference workflows
-   Added image generation and tool calling support to the local runtime
-   Added a HuggingFace model browser for searching and exploring GGUF models
-   Introduced hardware-aware compatibility checks with estimates for context length, quantization, and KV cache usage
-   Added a detailed scoring breakdown for model recommendations

Chat & Roleplay System

-   Increased the default max output token limit to 2048
-   Added proper memory rewind when branching chats and scene editing per session
-   Improved streaming stability, abort handling, and multimodal attachment reliability
-   Reworked group chats with configurable speaker selection modes for LLM, heuristic, and round-robin turn management
-   Added per-character mute, lorebooks, pinned messages, and typing haptics in group chats
-   Added reusable chat templates for preconfigured single-chat setups
-   Improved Dynamic Memory with missing-tag repair, cancelable memory cycles, and a no-tool-calling mode for unsupported models

UI / UX & Customization

-   Added a full chat appearance system with controls for font size, text colors, card colors, background blur, and more
-   Added multiple appearance presets for faster setup
-   Redesigned chat history, persona editor, character editor, and model editor
-   Added grid view support in the model browser and persona nicknames
-   Added full multi-language support with auto-detection and a language selector

Sync, Storage & Data

-   Rebuilt sync to compare client state and transfer only missing or outdated data instead of sending everything
-   Reduced bandwidth use and improved sync reliability with the new diff-based flow
-   Added chat package import and export support
-   Added SillyTavern `.jsonl` import support
-   Unified export flows for lorebooks, system prompts, and model configurations

Platform & Performance

-   Experimental iOS and macOS support is now available, though some features remain unstable
-   Optimized Android ONNX Runtime packaging and added a crash logger for fallback logging coverage
-   Refactored image delivery to use the Tauri Asset Protocol instead of IPC, reducing memory use and lag
-   Reduced UI jank in image-heavy flows and the HuggingFace browser
-   Improved lazy loading, rendering performance, and GPU fallback stability

Internal Architecture

-   Modularized the chat system into execution, memory, scene generation, and reply-helper layers
-   Added typed internal persistence and removed legacy command hops
-   Reorganized the app around feature-based module grouping with cleaner bootstrap boundaries

Fixes & Stability

-   Fixed provider credential routing issues
-   Fixed chat resend and duplicate-message behavior
-   Fixed scene and lorebook import bugs
-   Improved accessibility contrast and layout behavior
-   Improved mobile keyboard handling
-   Improved crash logging reliability

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

1.2.0 / Beta 4

February 15, 2026

## Desktop UX Overhaul, Prompt Runtime Controls, Dynamic Memory Upgrades & ONNX Reliability

Major investment in desktop UX and character creation flow, a significantly expanded prompt system, broad chat stability and performance hardening, and continued ONNX runtime reliability work across desktop, Android, and Windows.

Major investment in desktop UX, especially the Create Character flow

Significant expansion of the prompt system with a Prompt Structure Viewer and runtime prompt injection controls

Broad chat stability and performance hardening with large dynamic-memory improvements

Continued embedding and ONNX runtime reliability work across desktop, Android, and Windows

Expanded provider and model ecosystem, including NVIDIA NIM, plus safer import behavior

Desktop UI and Character Creation Redesign

-   Added responsive desktop layouts across character creation steps
-   Reworked character creation step order to improve setup flow
-   Redesigned character extras inputs for clearer, faster editing
-   Improved character create/edit with fallback-model selector support
-   Fixed create-step ordering and navigation consistency issues
-   Added metadata handling improvements for imported cards and avatar URL behavior
-   Added lorebook import support in character creation workflows

Prompt System Upgrades

-   Added Prompt Structure Viewer in the system prompt editor to preview message composition
-   Added conditional prompt injection mode
-   Added interval prompt injection mode
-   Added runtime option to condense prompts into a single system message
-   Fixed prompt import behavior to correctly respect `prompt_order`
-   Fixed drag-and-drop reorder bugs in the prompt entry editor
-   Improved prompt import UX and editor predictability

Chat, Group Chat, and UI

-   Added shared ChatLayout for persistent background behavior across chat sub-routes
-   Added shared GroupChatLayout with lifted data loading
-   Added branch-to-group-chat action from message actions
-   Added lorebook usage visibility per message
-   Added safe-area padding fixes for chat footer and bottom menu
-   Removed unwanted dark overlay above background images
-   Fixed chat search back-button sizing and related UI polish
-   Improved session back-stack handling across settings/history navigation
-   Fixed persona selection conflicts during scroll interactions
-   Removed duplicate dismiss controls in chat memories error state

Chat Stability and Performance

-   Fixed dynamic-memory listener leak during async chat setup
-   Bounded attachment cache growth in session hooks
-   Ignored stale attachment loads after chat state transitions
-   Fixed cleanup of jump-to-message RAF and timeout resources
-   Improved message memo checks with derived display props
-   Reduced attachment diff cost in chat memoization
-   Added fallback model retry logic with usage attribution
-   Disabled fallback attempts when no fallback model is configured
-   Added swap-places mode with role-aware generation
-   Reverted one streaming animation performance change after validation feedback

Dynamic Memory System

-   Added cursor-delta summarization of new messages
-   Added self-healing cursor behavior after deletes/rewinds
-   Added deduplication by cosine similarity at memory creation
-   Added adaptive decay rate based on access count
-   Added category tagging for memories
-   Added hybrid retrieval using similarity, recency, and access frequency
-   Added configurable retrieval selection limit
-   Added smart and cosine retrieval strategies
-   Added memory panel category filter chips
-   Added memory activity log redesign with timeline and collapsible UX
-   Auto-refresh of memory views after dynamic-memory completion
-   Enforced gating behavior for dynamic-memory manual mode

Embeddings, ONNX Runtime, and Android

-   Fixed ONNX runtime bundling and dylib path handling
-   Fixed dev rebuild-loop behavior tied to ONNX runtime integration
-   Pinned and standardized dylib preloading and path behavior
-   Ensured Android ONNX resource directory and packaging consistency
-   Improved desktop guards around ONNX runtime initialization
-   Improved handling of ORT init result variants and booleans
-   Added pre-step for embedding download and runtime ORT fetch
-   Extracted Windows DLL dependencies for ONNX runtime packaging
-   Locked ORT version to 2.0.0-rc.10
-   Added embedding model v3 support and multi-version management
-   Added experimental keep-loaded embedding runtime with cache reset on version switch
-   Fixed Android post-regenerate WebView freeze and tracing consistency
-   Made Android ONNX runtime init deterministic

Providers, Models, Endpoints, and Security

-   Added NVIDIA NIM provider
-   Added custom-provider tool-choice mode configurability
-   Added OpenRouter free-model toggle in model selector
-   Improved model selector search and suggestions
-   Added custom endpoint config persistence and auth/model-fetch mapping controls
-   Hid llama.cpp provider on mobile onboarding/settings where unsupported
-   Added security toggle to disable remote avatar downloads on card import
-   Disabled Chutes API key validation where it blocked onboarding flows

Lorebooks, Usage, Sync, and Tooling

-   Added world-info import/export and creation import action
-   Added character card metadata support and lorebook import path improvements
-   Added new pure mode content filtering system
-   Added app-time tracking backend support and analytics view
-   Enforced host-authoritative manifest diff in sync logic
-   Improved DB reset error surfacing and reset-in-place behavior
-   Migrated workflows to Blacksmith
-   Switched workflows to Bun and refreshed README/tooling docs
-   Added libclang dependency for Windows CI builds
-   Removed duplicate Cargo libraries and cleaned project config
-   Added `.gitignore` updates and docs-folder ignore adjustments
-   Fixed Tailwind warning noise in UI build paths

[View full release on GitHub →](https://github.com/LettuceAI/app/releases/tag/1.2.0)

Android · Desktop

1.1.0 / Beta 3

January 31, 2026

## Discovery, Group Chats, Smart Creator, Prompt Editor & Local Inference

This update brings Discovery, multi-character chats, a redesigned Smart Creator, and deeper local inference controls across Android and Desktop. It also includes a broad set of UI, stability, and workflow refinements shipped through January 31, 2026.

Discovery

-   A brand-new Discovery system powered by Character Tavern. Browse trending, popular, and newest cards or search directly, preview details before importing, and keep Pure Mode enabled to automatically filter NSFW results (with blurred avatars until you add a character).

Group Chats

-   A brand-new chat mode that lets multiple characters share one conversation. The app selects the next speaker automatically (or you can @mention to force a character), and roleplay groups can start with custom scenes. Long sessions are more stable with improved abort handling and streaming fixes.

Smart Creator

-   Smart Creator now supports Characters, Personas, and Lorebooks with a new goal selector and preview modes
-   Streaming responses and inline previews during creation
-   Smart Tool Selection toggle added, with manual tool presets and per-tool control in Advanced Settings
-   Image generation support with model selection in Advanced Settings
-   Smart Creator previews for Personas and Lorebooks

Help Me Reply

-   Help Me Reply now supports streaming, conversation/roleplay styles, and max token settings
-   Help Me Reply settings now allow per-feature model selection

Prompting System

-   Prompt Editor redesigned to be entry-based, with auto-scroll and mobile renaming
-   Per-entry roles and injection controls (including in-chat entries) for modular templates
-   System Prompt presets can be imported and exported
-   System Prompts UI redesigned and model-level prompts removed
-   Added `{{user}}` placeholder support and updated scene directions

Import & Export

-   Unified Entity Card (UEC) import/export support
-   Chara Card v1, v2, and v3 import support
-   Export characters as UEC, Chara Card v2 and v3
-   Personas can now be exported from the Library

Local Inference

-   Built-in llama.cpp runtime for desktop builds
-   Ollama now uses native endpoints
-   Automatic context length recommendations to prevent hardware crashes
-   Toggle to merge same-role messages for Ollama/llama.cpp compatibility
-   Advanced settings for local inference and support for <think> tags
-   CUDA support attempted for llama.cpp (currently disabled)

UI, UX & Stability

-   Redesigned Advanced Settings and Dynamic Memory pages
-   Creation menu refreshed and full-screen scene editor added
-   Improved persona selector and chat settings model selector menu
-   Long-press reordering for Lorebook and System Prompt entries on mobile
-   Redesigned toasts with unsaved changes protection (sticky + mobile bottom)
-   Fixed avatar display inconsistencies and Usage page text overflow
-   Bottom navigation simplified with larger icons and hidden labels
-   Dynamic Memory now works correctly after 120+ messages, with fixed counters
-   Fixed Mistral reasoning parameter handling and custom endpoint base URL display
-   Cost calculation fixes with a new recalc option in Advanced Settings
-   ONNX Runtime downgraded for broader device compatibility
-   Logging improved with a diagnostics section and global error integration
-   Embedding model load now has additional fail-safes

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Android · Desktop

Release / Beta 2

January 4, 2026

## Text-to-Speech, AI Character Creator, Reply Helper, Sync, Accessibility Upgrades & Voice Playback

This release brings LettuceAI to Android along with the second desktop beta update. It introduces Text-to-Speech voices, reply generation assistance, encrypted device-to-device sync, enhanced accessibility features, and per-character voice playback controls. These updates focus on expressiveness, comfort, and smoother roleplay workflows.

AI Character Creator

-   Conversational guided character creation
-   Automatic field filling (name, traits, description, etc.)
-   Optional starting scenes to define tone
-   Attach avatars & reference material
-   You can stop at any time, everything remains editable in the manual editor
-   The Creator uses your default app model

Text-to-Speech Voices

-   **Device TTS** – uses your system's built-in voice engine
-   **ElevenLabs** – natural voice synthesis with custom voice support
-   **Gemini TTS** – neural speech generation with custom voice support
-   You can also create custom voices with style descriptions and reuse them across characters
-   Generated audio is cached locally to reduce repeated regenerations

Reply Helper

-   **Use my text as base** — improve or complete your draft
-   **Write something new** — generate a fresh reply
-   **Regenerate** — try multiple suggestions
-   Reply Helper uses your default app model

Encrypted Device Sync

-   Peer-to-peer encrypted transfer
-   No servers or permanent connections
-   You start sync manually when needed
-   One device hosts a session, the other joins with a code. Once connected, your data is synced directly between devices

Accessibility Improvements

-   Per-event volume controls
-   Optional haptic feedback with selectable intensity
-   Lightweight and non-intrusive

Per-Message Voice Playback

-   Assign a default voice per character
-   Optional autoplay
-   Manual playback button per message

Scene Directions

-   Scenes now support private "direction" notes that are hidden from the chat UI and used only to guide model behaviour during the opening context of a scene

General Improvements

-   Improved character editing workflow
-   Better consistency across Android & Desktop
-   Internal cleanup & UI polish

Bug Fixes & Behaviour Improvements

-   Reasoning now works correctly with the Google Gemini endpoint
-   Fixed an issue where Dynamic Memory processing could cancel when switching pages
-   Fixed an issue where characters could be duplicated unexpectedly
-   Added a retry button to the embedding download screen
-   Fixed Backup settings failing to load existing backups
-   Redesigned the Edit Model page into a single-page layout
-   Disabled reasoning controls for the Mistral endpoint
-   Optimised entry animations in Settings
-   Optimised Markdown rendering performance
-   Added support for `(...)` and `[...]` as italic formatting shortcuts
-   Added Scene Directions to help guide starting scene behaviour

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Desktop

v1.0-beta.6.2

December 24, 2025

## Backup Fixes, Provider Expansion & Extended Timeout

Beta 6.2 is a stability and compatibility update focused on fixing critical backup issues, expanding provider support with Ollama and LM Studio, and improving reasoning model compatibility.

Bug Fixes

-   **Fixed backup issues** where data wasn't fully saved
-   **Fixed characters losing context** after restore
-   **Fixed OpenRouter & MistralAI reasoning** to work correctly with reasoning-capable models
-   **Fixed backups with images** not loading properly

New Features

-   **Added Ollama & LM Studio endpoint support** for locally hosted models
-   **Added custom OpenAI / Anthropic-compatible endpoints** for flexible API integration
-   **Increased request timeout** from 2 minutes to 15 minutes for better handling of slow models and reasoning tasks

[View full release on GitHub →](https://github.com/LettuceAI/app/releases)

Desktop

v1.0-beta.6

December 21, 2025

## Dynamic Memory v2, Lorebooks, In-Chat Image Generation & Major Performance Improvements

Beta 6 is a major systems and UX update focused on memory accuracy, world consistency, creative flexibility, and performance. It's designed to make long conversations faster, more coherent, and easier to control, while expanding what's possible inside a single chat.

Dynamic Memory v2

-   Dynamic Memory has been significantly upgraded with faster, more responsive memory handling, higher recall accuracy, improved behavior in long-running chats, and better stability across multiple memory cycles
-   Dynamic Memory v2 is designed to scale cleanly as conversations grow

New Embedding Model

-   A new embedding model now powers memory retrieval in Beta 6. It is approximately 50% smaller than the previous model, runs faster during inference, and supports up to 4096 tokens (previously 512)
-   Existing memories remain compatible. No migration required

Context Enrichment (Experimental)

-   An experimental Context Enrichment feature has been introduced. It enhances memory queries using the new embedding model, improves recall accuracy in follow-up messages, and reduces ambiguity during semantic search
-   This feature is currently experimental and may evolve in future releases

Lorebooks

-   Lorebooks introduce a structured way to inject world, character, and knowledge information into chats. Define locations, factions, rules, history, and concepts. Lore entries are automatically injected when relevant and treated as established canon
-   Lorebooks improve consistency across scenes and long roleplay sessions while staying separate from character memory

In-Chat Image Generation

-   Images can now be generated directly inside conversations. This is supported for models that expose image generation capabilities, enabling visual storytelling and richer creative workflows directly within the chat flow

Model & API Improvements

-   Added support for the **Chutes API endpoint**
-   Introduced an **OpenAI-compatible API endpoint** with extensive customization including custom user/assistant role names and flexible chat completion behavior
-   Added **Reasoning support** for models that expose reasoning tokens

Chat & Workflow Improvements

-   **Rewind to Here:** Resume conversations from any previous user message. Explore alternate paths without losing history
-   **Redesigned Chat Settings:** A new Chat Settings panel designed based on user feedback and suggestions

UI & Layout Improvements

-   Redesigned Character Cards for better clarity and hierarchy
-   Chat Header memory button now shows memory status and usage
-   Improved consistency across chat, settings, and character screens
-   Refined spacing, typography, and interaction feedback
-   Reduced visual noise in frequently used views
-   Redesigned chat history layout for readability

Desktop Builds

-   LettuceAI continues to be available as beta desktop builds alongside the mobile app
-   **Windows:** .msi installer, .exe portable build
-   **Linux:** .AppImage, .deb, .rpm
-   Desktop builds are still considered beta while platform-specific issues are being refined. Functionality generally matches the mobile app unless otherwise noted

Performance Improvements

-   Long chats now load up to **~8x faster**
-   Character list on the homepage loads faster and scrolls more smoothly
-   Improved internal state handling and caching logic
-   Backup system robustness significantly improved

Bug Fixes

-   Fixed an issue where Dynamic Memory could get stuck after cycle 2
-   Fixed an app freeze caused by corrupted or invalid backup files
-   Fixed an incorrect Google API endpoint URL

Thank You

-   Beta 6 is a foundational release that strengthens LettuceAI's core systems while expanding both creative and technical flexibility. Your feedback continues to shape LettuceAI into a deeply customizable, privacy-first AI companion built for long-term conversations and roleplay.

[View full release on GitHub →](https://github.com/LettuceAI/mobile-app/compare/1.0-beta.5...1.0-beta.6)