Built for people who value privacy, freedom, and control over their AI.
Fine-tune the model on your own data without corporate restrictions. Get direct, unfiltered answers to your questions: no more "I can't help with that" messages.
Your conversations never leave your computer. No data sent to OpenAI, Google, or anyone else. Perfect for sensitive documents, personal journals, or confidential work.
No $20/month ChatGPT Plus. No API bills that spike unexpectedly. One-time setup, unlimited usage forever. Your electricity, your AI.
Upload your PDFs, notes, research papers, or personal files. Ask questions about them in natural language. The AI actually reads YOUR documents.
No internet required after setup. Use it on airplanes, in remote locations, or when your WiFi dies. Your AI is always available.
Fine-tune the model to speak like you, understand your industry jargon, or focus on topics you care about. Make it truly personal.
Complete technical breakdown for developers. All values extracted from actual source code.
The trade-offs: what you give up in exchange for complete privacy and zero recurring costs.
No coding experience required. Just follow the steps.
Open your terminal (Command Prompt on Windows, Terminal on Mac/Linux) and run:
git clone https://github.com/Akshar-Guha/Chat_Bot.git
cd Chat_Bot
Create an isolated environment so it doesn't mess with other Python projects:
# Windows
python -m venv .venv
.venv\Scripts\activate
# Mac/Linux
python3 -m venv .venv
source .venv/bin/activate
You'll see (.venv) appear in your terminal when it's active.
This downloads all the required libraries:
pip install -r requirements.txt
This may take a few minutes. You'll see a lot of text scrolling; that's normal.
First, make sure Ollama is running (open the Ollama app). Then download the Llama model:
ollama pull llama3.2:1b
This downloads ~1.3GB. The model will be stored locally and works offline after this.
Put your PDF, TXT, or DOCX files in the data/ folder, then index them:
python -m src.core.cli ingest ./data/
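Under the hood, ingestion typically splits each document into overlapping chunks before indexing, so answers aren't cut off at chunk boundaries. A minimal sketch of that idea (not the repo's actual code; `size` and `overlap` values are illustrative):

```python
def chunk_text(text: str, size: int = 500, overlap: int = 100) -> list[str]:
    """Split text into overlapping chunks so context isn't lost at boundaries."""
    chunks = []
    step = size - overlap  # each chunk starts `overlap` chars before the previous one ends
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
    return chunks

doc = "".join(str(i % 10) for i in range(1200))
pieces = chunk_text(doc)
print(len(pieces))  # 1200 chars with step 400 -> 3 chunks
```

Each chunk shares its first 100 characters with the tail of the previous chunk, which is what lets a retriever find passages that straddle a boundary.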
Launch the API server:
python -m src.api.main
Open http://localhost:8000/docs in your browser to see the API.
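Once the server is up, you can talk to it from the command line. The route name and JSON shape below are assumptions for illustration; check http://localhost:8000/docs for the actual schema:

```shell
# Hypothetical request: substitute the real route and fields from /docs
curl -X POST http://localhost:8000/chat \
  -H "Content-Type: application/json" \
  -d '{"message": "What does my document say about pricing?"}'
```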
For developers who want to fine-tune Llama on their own data. Requires Google Colab (free tier works).
Create a JSON file with your training examples in this format:
[
{
"instruction": "Summarize the following document",
"input": "Your document text here...",
"output": "The expected summary..."
},
{
"instruction": "Answer this question about the document",
"input": "What is the main topic?",
"output": "The main topic is..."
}
]
You'll need 500-5000 examples for good results. Quality matters more than quantity.
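Before spending GPU hours, it's worth sanity-checking the dataset against the format above. A small helper (the field names match the JSON example; everything else here is an illustrative sketch):

```python
import json

REQUIRED_KEYS = {"instruction", "input", "output"}

def validate_dataset(examples: list[dict]) -> list[str]:
    """Return a list of problems found; an empty list means the data looks OK."""
    problems = []
    for i, ex in enumerate(examples):
        missing = REQUIRED_KEYS - set(ex)
        if missing:
            problems.append(f"example {i}: missing {sorted(missing)}")
        elif not ex["output"].strip():
            problems.append(f"example {i}: empty output")
    return problems

# Inline sample standing in for your JSON file (json.load(open("train.json")))
data = [
    {"instruction": "Summarize", "input": "text...", "output": "summary..."},
    {"instruction": "Answer", "input": "Q?"},  # broken on purpose: no output
]
print(validate_dataset(data))  # ["example 1: missing ['output']"]
```

Running this over your real file before uploading to Colab catches malformed examples that would otherwise crash (or silently degrade) the training run.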
Key settings in the notebook (already optimized for T4 GPU):
# LoRA Configuration
lora_r = 8 # Rank (higher = more capacity, more VRAM)
lora_alpha = 16 # Scaling factor
lora_dropout = 0.1 # Regularization
# Quantization (saves VRAM)
load_in_4bit = True # 4-bit NF4 quantization
bnb_4bit_compute_dtype = "float16"
# Training
num_epochs = 3
batch_size = 2 # Increase if you have more VRAM
gradient_accumulation = 8 # Effective batch = 2 × 8 = 16
learning_rate = 2e-4
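If the notebook uses the Hugging Face stack (an assumption; Colab fine-tuning notebooks commonly build on `transformers` + `peft` + `bitsandbytes`), the settings above map onto library config objects roughly like this:

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization, matching load_in_4bit / compute dtype above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter: rank 8, alpha 16, dropout 0.1, as in the notebook
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # assumption: typical Llama attention projections
)
```

Raising `r` increases adapter capacity (and VRAM use); `lora_alpha` is conventionally set to about 2× `r`.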
Execute all cells in the notebook to start the training run.
Training on ~1K samples takes about 2 hours on a T4. Watch the loss curve; it should decrease steadily.
After training, download the adapter and merge with the base model:
# In the notebook, after training completes:
# The adapter is saved to Google Drive automatically
# On your local machine:
# 1. Download the adapter folder from Drive
# 2. Merge with Ollama:
ollama create my-custom-model -f Modelfile
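The Modelfile for that merge step might look like the fragment below. `FROM` and `ADAPTER` are Ollama's directives for the base model and a LoRA adapter; the adapter path is hypothetical, so point it at whatever folder you downloaded from Drive:

```
# Base model must match the one the adapter was trained on
FROM llama3.2:1b

# Path to the downloaded LoRA adapter folder (hypothetical name)
ADAPTER ./my-adapter
```

After `ollama create`, run `ollama run my-custom-model` to chat with the fine-tuned model.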
500 high-quality examples beat 5000 noisy ones. Clean your data carefully.
Training loss should decrease. If it plateaus, try a lower learning rate.
Colab sessions disconnect. Enable Drive auto-save in the notebook.
Train for 1 epoch, test, then continue. Don't train blindly for 10 epochs.
No more sending your data to the cloud. No more content filters. Your AI, your rules.