
⁉️ Frequently Asked Questions

How much VRAM does an LLM consume?

By default, Tabby operates in int8 mode with CUDA, requiring approximately 8GB of VRAM for CodeLlama-7B.

For ROCm, the actual limits are currently largely untested, but according to the ROCm monitoring tools the same CodeLlama-7B model also appears to use about 8GB of VRAM on an AMD Radeon™ RX 7900 XTX.

What GPUs are required for reduced-precision inference (e.g., int8)?
  • int8: Compute Capability >= 7.0 or Compute Capability 6.1
  • float16: Compute Capability >= 7.0
  • bfloat16: Compute Capability >= 8.0

To determine the mapping between the GPU card type and its compute capability, please visit this page
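
The thresholds above can be expressed as a small shell check. This is a minimal sketch: `supported_precisions` is a hypothetical helper, not part of Tabby, and it takes the compute capability as a string such as `7.5`. On recent NVIDIA drivers the value can usually be read with `nvidia-smi --query-gpu=compute_cap --format=csv,noheader`, but treat that as an assumption about your driver version.

```shell
# Hypothetical helper: print the reduced-precision modes usable at a given
# CUDA compute capability, following the thresholds listed above.
supported_precisions() {
  cap="$1"
  major="${cap%%.*}"
  minor="${cap#*.}"
  # Treat a bare major version such as "8" as "8.0".
  [ "$minor" = "$cap" ] && minor=0
  modes=""
  # int8: compute capability >= 7.0, or exactly 6.1
  if [ "$major" -ge 7 ] || { [ "$major" -eq 6 ] && [ "$minor" -eq 1 ]; }; then
    modes="int8"
  fi
  # float16: compute capability >= 7.0
  if [ "$major" -ge 7 ]; then
    modes="$modes float16"
  fi
  # bfloat16: compute capability >= 8.0
  if [ "$major" -ge 8 ]; then
    modes="$modes bfloat16"
  fi
  echo "${modes:-none}"
}

supported_precisions 6.1
supported_precisions 7.5
supported_precisions 8.6
```

A card that reports `6.0` prints `none`, since it meets none of the three thresholds.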

How to utilize multiple NVIDIA GPUs?

Tabby only supports the use of a single GPU per instance. To utilize multiple GPUs, you can start multiple Tabby instances and set CUDA_VISIBLE_DEVICES (for CUDA) or HIP_VISIBLE_DEVICES (for ROCm) accordingly.
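
One way to sketch the one-instance-per-GPU setup is a helper that prints the command for each GPU. `tabby_cmd_for_gpu` is a hypothetical name, and the model ID, the `--device` and `--port` flags, and the "base port plus GPU index" scheme are illustrative assumptions rather than required settings. The helper only prints commands and does not launch anything.

```shell
# Hypothetical helper: print the command line for one Tabby instance pinned
# to a single GPU. It only prints the command; nothing is launched here.
# Assumptions: the model ID, the --device and --port flags, and the
# "base port + GPU index" scheme are illustrative and may need adjusting.
tabby_cmd_for_gpu() {
  gpu="$1"
  base_port="${2:-8080}"
  port=$((base_port + gpu))
  echo "CUDA_VISIBLE_DEVICES=$gpu tabby serve --model TabbyML/CodeLlama-7B --device cuda --port $port"
}

# One command per GPU; run each in its own terminal, service, or container.
for gpu in 0 1; do
  tabby_cmd_for_gpu "$gpu"
done
```

For ROCm, the same idea applies with HIP_VISIBLE_DEVICES in place of CUDA_VISIBLE_DEVICES. Each instance listens on its own port, so clients or a load balancer need to be pointed at the individual ports.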

My AMD device isn't supported by ROCm

If ROCm does not support your GPU but does support a similar one, you can set the HSA_OVERRIDE_GFX_VERSION environment variable to the version of that supported GPU.

For example, set it to 10.3.0 for RDNA2 or to 11.0.0 for RDNA3.
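
As a minimal sketch, the two values above can be looked up and exported before starting Tabby. `gfx_override_for` is a hypothetical helper and knows only the two generations mentioned here; any other input is rejected rather than guessed.

```shell
# Hypothetical helper: map an RDNA generation to the override value
# mentioned above (10.3.0 for RDNA2, 11.0.0 for RDNA3).
gfx_override_for() {
  case "$1" in
    rdna2|RDNA2) echo "10.3.0" ;;
    rdna3|RDNA3) echo "11.0.0" ;;
    *) return 1 ;;
  esac
}

# Example: an RDNA2 card that ROCm does not list as supported.
if version="$(gfx_override_for rdna2)"; then
  export HSA_OVERRIDE_GFX_VERSION="$version"
  echo "HSA_OVERRIDE_GFX_VERSION=$HSA_OVERRIDE_GFX_VERSION"
fi
```

Because the variable is exported, a Tabby server started afterwards from the same shell inherits it.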

How can I use my own model with Tabby?

Please follow the Tabby Model Specification to create a directory with the specified files. You can then pass the directory path to --model or --chat-model to start Tabby.

Can I use a local model with Tabby?

Yes. Tabby supports loading models from a local directory that follows the specification outlined in MODEL_SPEC.md.

Is it safe to put Tabby's root directory on an NFS filesystem?

No, it is not recommended to put Tabby's root directory on an NFS filesystem.

Tabby depends on SQLite for its database operations, and SQLite requires working file locking to ensure data integrity. Many networked filesystems, especially NFS, have broken or missing lock implementations that can lead to database corruption and unexpected behavior.

For more technical details about this limitation, please refer to the SQLite documentation: Filesystems with broken or missing lock implementations.
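
To check where the root directory currently lives, a sketch like the following may help. It assumes GNU coreutils `stat` (common on Linux) and assumes the root directory is `$TABBY_ROOT`, falling back to `~/.tabby`. Both helper names are hypothetical, and the check only recognizes NFS, not other networked filesystems.

```shell
# Hypothetical helper: succeed if a filesystem type name looks like NFS.
is_nfs_type() {
  case "$1" in
    nfs*) return 0 ;;
    *) return 1 ;;
  esac
}

# Report the filesystem type of a directory, or "unknown" if it cannot be
# determined (missing directory, or a stat that lacks GNU's -f -c options).
fs_type_of() {
  stat -f -c %T "$1" 2>/dev/null || echo "unknown"
}

# Assumption: the root directory is $TABBY_ROOT, falling back to ~/.tabby.
root="${TABBY_ROOT:-$HOME/.tabby}"
fstype="$(fs_type_of "$root")"
if is_nfs_type "$fstype"; then
  echo "warning: $root is on $fstype; SQLite locking may be unreliable"
else
  echo "$root filesystem type: $fstype"
fi
```

A result of `unknown` means the check could not run, not that the directory is safe.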

How do I enable debug logging in the Tabby server?

The Tabby server utilizes the RUST_LOG environment variable to control logging levels. To enable debug logging, you can set this variable when you start the server.

For example, if you are running Tabby from the command line, you can do so like this:

RUST_LOG=debug tabby serve

If you are using Docker, you can pass the environment variable using the -e flag:

docker run -e RUST_LOG=debug ...