TritonGPT Large Language Models (LLMs)

As commercial generative AI services continue to evolve, users are increasingly familiar with large language models (LLMs) from providers such as OpenAI. In response, the TritonGPT platform at UC San Diego now offers access to multiple LLMs, so users can choose the model best suited to their specific needs.

To maintain the highest standards of data protection, all models are deployed through secure, institutionally governed infrastructure that ensures compliance, privacy, and robust security controls.

Llama

The default model is Meta's Llama, a dialogue-optimized, open-source LLM hosted securely on-premises at the San Diego Supercomputer Center (SDSC).

By hosting the Llama model within SDSC's secure environment, UC San Diego retains full control over data usage and sharing. All processing occurs entirely within the university's infrastructure, ensuring institutional oversight and protection.

OpenAI Models

The open-source OpenAI model (OSS) is hosted entirely on-premises at SDSC. All inference and data processing occur within UC San Diego–managed infrastructure; data never leaves SDSC, is not shared with external entities, and is not used to train or improve any models.

For proprietary models accessed via Microsoft Azure, the Azure OpenAI Service runs within Azure AI Foundry. Data provided to the service remains private and secure, is not shared with OpenAI or any third party, and is never used to train or enhance models. These Azure-hosted models are configured to maintain full U.S. data residency.

Commitment to Privacy and Compliance

By leveraging Azure's secure infrastructure for cloud models and SDSC's on-premises hosting for open-source models, UC San Diego ensures that all data privacy, security, and compliance requirements are fully met.

Model Summaries

Model | Description | Designed For
Llama Scout | A fast, lightweight model hosted within TritonGPT’s on-premises architecture. It provides low-latency responses and answers questions efficiently. | General-purpose chat, experimentation, and scenarios requiring minimal compute overhead.
GPT-4o (Omni) | GPT-4o enables multimodal interaction with strong performance across text, vision, and conversational workflows. Ideal for versatile tasks that benefit from rich dialogue and visual input. | Natural language interaction tuned for content generation, image-based reasoning, and broad administrative or creative tasks.
GPT-4.1 | GPT-4.1 is optimized for complex tasks, long-context comprehension, and coding/software development. | Dialogue, code generation, and data-rich problem solving.
GPT OSS | A versatile open-source model that uses a mixture-of-experts architecture for optimized performance. | Advanced reasoning that provides a high degree of accuracy in a Retrieval-Augmented Generation (RAG) setup.
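
For readers who want to reason about the summary table programmatically, the sketch below encodes it as a small Python lookup and filters models by where they are hosted. It is illustrative only and is not part of any TritonGPT API: the `ModelInfo` class, the `MODELS` list, the `"sdsc"`/`"azure"` labels, and the `models_hosted_at` helper are all hypothetical names. Placing GPT-4o and GPT-4.1 under Azure is an assumption based on the statement above that proprietary models are accessed via Microsoft Azure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    """One row of the model summary table (hypothetical structure)."""
    name: str
    hosting: str       # "sdsc" = on-premises at SDSC, "azure" = Azure OpenAI Service
    designed_for: str


# Hosting for GPT-4o and GPT-4.1 is an assumption: the page says proprietary
# models are accessed via Microsoft Azure but does not list them by name.
MODELS = [
    ModelInfo("Llama Scout", "sdsc",
              "General-purpose chat, experimentation, minimal compute overhead"),
    ModelInfo("GPT-4o (Omni)", "azure",
              "Content generation, image-based reasoning, administrative or creative tasks"),
    ModelInfo("GPT-4.1", "azure",
              "Dialogue, code generation, data-rich problem solving"),
    ModelInfo("GPT OSS", "sdsc",
              "Advanced reasoning in a Retrieval-Augmented Generation setup"),
]


def models_hosted_at(hosting: str) -> list[str]:
    """Return the names of models with the given hosting label, in table order."""
    return [m.name for m in MODELS if m.hosting == hosting]


if __name__ == "__main__":
    # List the models whose processing stays on-premises at SDSC.
    print(models_hosted_at("sdsc"))
```

A filter like this could help a user who needs all processing to remain on-premises narrow the choice to the SDSC-hosted models before comparing their intended uses.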