Mosaic AI capabilities for generative AI apps
Mosaic AI includes a suite of tools that addresses the challenges of building high quality, production gen AI apps. This code-first set of tools provides functionality for building, evaluating, deploying, and monitoring gen AI apps at scale. Each component integrates with Unity Catalog for unified governance and with MLflow for experiment tracking, tracing, and versioning.
Overview of Mosaic AI components
Model Serving helps you:
- Deploy and query gen AI applications and models using a unified interface
- Govern and monitor deployments
Agent Evaluation helps you:
- Quickly iterate to high quality in your development loop.
- Monitor and fix quality during production.
- Collect input and feedback from subject matter experts on gen AI app quality.
- Manage evaluation datasets to define and measure quality.
Agent Framework enables you to:
- Deploy your gen AI app code + config - logged as MLflow models - to production-ready serving endpoints with a single line of code.
MLflow provides capabilities to:
- Instrument your gen AI apps to provide observability and telemetry.
- Log your gen AI app’s code and configuration to manage their lifecycle.
- Author agents using a wide variety of frameworks including langchain, langgraph, crewAI, OpenAI SDK.
Enable data intelligence
Databricks Data Intelligence platform learns about your data and uses that insight to assist you, while still providing strong governance and security:
- Vector Search: Automatically index your knowledge base at scale for similarity or hybrid search.
- Genie: Use natural language to query your structured data.
- Serverless SQL: Integrate existing data sources into your gen AI app for analytics or transformations.
- Online Tables: Access real-time features within your gen AI app.
How Mosaic AI enables gen AI app development
Mosaic AI helps you solve some of the challenges of gen AI app development.
End-to-end governance across data and AI
Tight integration with Unity Catalog to provide a single source of truth for data and AI governance:
- Unity Catalog Functions: Governance for your agent system’s SQL and Python-based tools.
- Unity Catalog Models: Governance for your agent system’s code and config.
- Unity Catalog Connections: Governance for your internal and external APIs used by your agent system.
Unified telemetry and observability across all gen AI Apps deployed on and off Databricks:
- MLflow Tracing allows you to instrument your gen AI apps to collect telemetry and observability data for auditing and quality evaluation/monitoring.
- AI Gateway allows you to track usage and log requests, traces, and user feedback.
Lakeguard provides a sandboxed code execution environment so that your tools respect Unity Catalog governance and ACLs.
Production-quality deliverables
Easy to collaborate with subject matter experts (SMEs) to collect input to inform the definition of quality.
- Agent Evaluation provides built-in UIs to allow subject matter experts to label evaluation/training data and provide feedback on gen AI app outputs.
- Monitoring UIs help analyze SME interactions and transform valuable feedback into structured evaluation data for ongoing improvement.
Accurate, fast, quality measurement in development and production, for gen AI apps deployed both on and outside of Databricks.
- With Agent Evaluation:
- Evaluation sets are used to define and measure the standard of quality.
- AI judges measure quality and identify the root causes of quality issues.
- Agent monitoring to automatically assess the quality of production deployments using LLM judges.
- Monitoring UI to identify and debug quality issues during production.
Rapid development tools that reduce development time
Agent Evaluation
- Synthetic evaluation set generation provides high-quality evaluation data to test and evaluate gen AI app quality before engaging SMEs.
- Evaluation functionality facilitates quickly assessing quality using the LLM judges and evaluation datasets.
Agent Framework: Deploy your gen AI app’s code & config, logged as MLflow models, to production-ready serving APIs hosted on model serving using one line of code.
AI Playground: Sandboxed UI to interact with a deployed app.
See more