Run AI in your office, factory network, or air‑gapped environment. Secure RAG, summarization, search, and chat—fully closed.

Many companies use cloud AI like ChatGPT, but face critical barriers in real business operations.
Customer data and business strategies entered in prompts are sent to external servers and may be reused for AI training, risking unintended leakage to third parties.
While strong on general knowledge, cloud AI lacks your internal rules, past incident patterns, and current deal progress—often producing generic or inaccurate responses.
Strict industry regulations and internal policies (e.g., ISMS) often prohibit sending confidential data to external clouds, limiting AI adoption.
Build and operate AI models within your dedicated server environment or fully isolated private cloud.
Data never leaves your network. From prompt history to training data, everything stays under your control, physically eliminating leakage risk.
Connect PDFs, Excel, meeting notes, and manuals directly to AI. Turn it into a dedicated intelligence that deeply understands your business.
No dependency on external API downtime or network latency. Secure facilities and factories with limited internet can run AI reliably.
Pre-configured machines delivered ready for deployment. No complex setup required.
Deploy Llama on one internal server; connect from devices on the same local network.
Server Setup
Install Llama on your physical server and run it as an inference API. Data and processing stay entirely on your premises.
Client Connection
PCs and tablets on the same LAN send requests to the server IP. Use from browsers, business apps, or chat tools via API.
Benefits
Reuse existing internal networks—no extra lines or VPN. Restrict server access to internal traffic via firewall for secure AI use.
RAG turns AI from a search tool into a digital team member that leverages your knowledge base.
"Analyze all negotiation logs with Client C over the past 5 years and derive expected objections and countermeasures from successful patterns."
"From 30-year-old blueprints, spec change history, and maintenance reports, list possible causes of current abnormal vibration in order of likelihood."
"Compare latest regulations with our work rules and past labor notices to determine overseas business trip allowance validity and cite relevant provisions."
On-premise AI with RAG excels in sales strategy formulation.
Integrated Input Data
CRM negotiation notes, 3 years of email history, detailed quotes, competitor comparisons, preference logs per contact.
Example AI Strategic Responses
• "The contact has historically prioritized delivery and support over price."
• "A lost deal 2 years ago was decided by Competitor X's feature. Emphasize our latest update that addresses this."
• "Recent earnings and meeting notes show Client B is focusing on production automation."
coiai delivers fast deployment in as little as 1 week after hardware setup.
Requirements & Data Discovery
Hear about challenges, identify data sources (shared folders, DBs), and select the right AI model.
Environment Build
Deploy AI runtime on high-performance VRAM PCs in a private network. Secure configuration with external connectivity disabled.
RAG & Tuning
Load your data, validate accuracy. Apply prompt engineering and search tuning until ready for production.
Production & Iteration
Train users. Analyze usage logs and expand data as needed to keep AI up to date.
Monthly estimates by scale. Hardware assumed under lease. Actual amounts vary by model, specs, and lease terms.
| Scale | Concurrent Users | Configuration | Monthly (excl. tax) | Initial (excl. tax) |
|---|---|---|---|---|
| Single user | 1 | Mini server or high-end PC (CPU inference or small GPU). Existing devices can be reused. | ¥60,000 | ¥120,000 |
| Small team | 5–15 | GPU server (e.g., NVIDIA RTX/A series). RAG, multi-session. | ¥200,000 | ¥400,000 |
| Enterprise | 20–50+ | High-end GPU or multi-node. Load balancing, HA. | ¥500,000 | ¥1,000,000 |
Initial cost is 2 months of lease. Includes requirements, Llama/API setup, network integration, and basic documentation.
Monthly lease includes standard support: incident response, software maintenance (Llama/OS updates), operations support, and RAG re-indexing.
The real value of on-premise AI lies in how you combine data and frame questions.
Free Demo
See RAG cite internal documents and generate answers in real time.
Requirements Consultation
We answer questions on security compliance, system integration, scalability, and costs.
Is data sent outside our organization?
No. AI runs entirely locally—no internet required. Leakage risk is minimized.
What models are supported?
Open models such as Llama and Mistral. Custom models available by use case.
Can it integrate with existing systems?
Yes. Depending on plan: file servers, major DBs, Slack, Teams, etc.