3 posts tagged with "release"

Announcing our $3.2M seed round, and the long-awaited RAG release in Tabby v0.3.0

· 2 min read

We are excited to announce that TabbyML has raised a $3.2M seed round to move towards our goal of building an open ecosystem to supercharge developer experience with LLMs 🎉🎉🎉.

Why Tabby 🐾?

With over 10 years of coding experience, we recognize the transformative potential of LLMs in reshaping developer toolchains. While many existing products lean heavily on cloud-based end-to-end solutions, we firmly believe that for AI to genuinely become the core of every developer's toolkit, the next-gen LLM-enhanced developer tools should embrace an open ecosystem. This approach promotes not just flexibility for easy customization, but also fortifies security.

Today, Tabby stands out as the most popular and user-friendly solution to enable a coding assistant experience fully owned by users. Looking ahead, we are poised to delve even further into the developer lifecycle, and innovate across the full spectrum. At TabbyML, developers aren't just participants; they are at the heart of the LLM revolution.

Release v0.3.0 - Retrieval Augmented Code Completion 🎁

Tabby also arrives at its v0.3.0 release, with support for retrieval-augmented code completion enabled by default. Enhanced by repo-level retrieval, Tabby gets smarter about your codebase and will quickly reference related functions and code examples from other files in your repository.

A blog series detailing the technical designs of retrieval-augmented code completion will be published soon. Stay tuned! 🔔

Example prompt for retrieval-augmented code completion:

// Path: crates/tabby/src/serve/engine.rs
// fn create_llama_engine(model_dir: &ModelDir) -> Box<dyn TextGeneration> {
//     let options = llama_cpp_bindings::LlamaEngineOptionsBuilder::default()
//         .model_path(model_dir.ggml_q8_0_file())
//         .tokenizer_path(model_dir.tokenizer_file())
//         .build()
//         .unwrap();
//
//     Box::new(llama_cpp_bindings::LlamaEngine::create(options))
// }
//
// Path: crates/tabby/src/serve/engine.rs
// create_local_engine(args, &model_dir, &metadata)
//
// Path: crates/tabby/src/serve/health.rs
// args.device.to_string()
//
// Path: crates/tabby/src/serve/mod.rs
// download_model(&args.model, &args.device)
    } else {
        create_llama_engine(model_dir)
    }
}

fn create_ctranslate2_engine(
    args: &crate::serve::ServeArgs,
    model_dir: &ModelDir,
    metadata: &Metadata,
) -> Box<dyn TextGeneration> {
    let device = format!("{}", args.device);
    let options = CTranslate2EngineOptionsBuilder::default()
        .model_path(model_dir.ctranslate2_dir())
        .tokenizer_path(model_dir.tokenizer_file())
        .device(device)
        .model_type(metadata.auto_model.clone())
        .device_indices(args.device_indices.clone())
        .build()
        .
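
To illustrate the idea, here is a minimal sketch of how such a prompt could be assembled: retrieved snippets are serialized as // Path: comment blocks and prepended to the code preceding the cursor, which the model then completes. This is a simplified approximation for illustration only, not Tabby's actual implementation; the Snippet struct and build_prompt function are hypothetical names.

/// Hypothetical snippet retrieved from a repository-level index.
struct Snippet {
    path: String,
    body: String,
}

/// Sketch only: serialize each retrieved snippet as a `// Path: ...`
/// comment block, then append the code preceding the cursor so the
/// model continues from there.
fn build_prompt(snippets: &[Snippet], prefix: &str) -> String {
    let mut prompt = String::new();
    for snippet in snippets {
        prompt.push_str(&format!("// Path: {}\n", snippet.path));
        for line in snippet.body.lines() {
            prompt.push_str(&format!("// {line}\n"));
        }
        prompt.push_str("//\n");
    }
    prompt.push_str(prefix);
    prompt
}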

Tabby v0.1.1: Metal inference and StarCoder support!

· 2 min read

We are thrilled to announce the release of Tabby v0.1.1 👍🏻.

[Image: a staring tabby riding on llama.cpp, created with SDXL-botw and a Twitter post from BigCode]

Tabby users on Apple M1 and M2 chips can now harness Metal inference by passing the --device metal flag, thanks to llama.cpp's awesome Metal support.
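
For example, with a local Tabby installation and the StarCoder-1B model from our registry, serving with Metal support might look like this (an illustrative invocation; see the docs for your exact setup):

tabby serve --model TabbyML/StarCoder-1B --device metal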

The Tabby team also contributed support for the StarCoder series models (1B/3B/7B) to llama.cpp, enabling more appropriate model usage on the edge for completion use cases.

llama_print_timings:        load time =   105.15 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
llama_print_timings: prompt eval time =    25.07 ms /     6 tokens (    4.18 ms per token,   239.36 tokens per second)
llama_print_timings:        eval time =   311.80 ms /    28 runs   (   11.14 ms per token,    89.80 tokens per second)
llama_print_timings:       total time =   340.25 ms

Inference benchmarking with StarCoder-1B on Apple M2 Max now takes approximately 340ms, compared to the previous time of around 1790ms. This represents a roughly 5x speed improvement.

This enhancement delivers a significant inference speed upgrade 🚀 and marks a meaningful milestone in Tabby's adoption on Apple devices. Check out our Model Directory to discover LLM models with Metal support! 🎁

tip

Check out the latest Tabby updates on LinkedIn and in our Slack community! Our Tabby community is eager for your participation. ❤️

Introducing First Stable Release: v0.0.1

· One min read

We're thrilled to announce Tabby's first stable version, v0.0.1! 🎉 This marks a significant milestone in our continuous efforts to refine our platform. The Tabby API specification is now officially stable, providing a reliable foundation for future development. 🚀

📦 To enjoy these improvements, simply upgrade your Tabby instance to v0.0.1 using the image tag: tabbyml/tabby:v0.0.1. This ensures access to the latest features and optimizations.
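
For instance, pulling the tagged image uses a standard Docker command; your existing serve flags stay the same:

docker pull tabbyml/tabby:v0.0.1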

📚 For more details and to access the release, visit our GitHub repository at https://github.com/TabbyML/tabby/releases/tag/v0.0.1.

We deeply appreciate your ongoing support and feedback, which have been instrumental in shaping Tabby's development. 🙏 We're committed to delivering excellence and innovation as we move forward.

Thank you for being part of the Tabby community! ❤️