The Conversation Loop
A multi-turn terminal chatbot: the history sent to the model grows with every message, so the bot can reference anything said earlier in the session.
How Multi-Turn Memory Works
The key insight: the Gemini API is stateless — it doesn't remember anything between calls. So you have to build memory yourself by appending every message to a list and sending the entire conversation history with each new request.
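1. The user's message is read with input(). The loop checks for the "quit" command before processing.
2. The message is appended to the conversation_history list as a {"role": "user", ...} dict.
3. The full history is passed to model.generate_content() as context.
4. The reply is appended with the "model" role, ready for the next turn.
The Full Code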
Every part of the chatbot in about 60 lines of Python.
API Setup & Initialization
The chatbot uses google-generativeai, Google's official Python SDK for Gemini. The API key is loaded from an environment variable — never hardcoded — using os.getenv().
The GenerativeModel is initialized once with the system instruction baked in — this sets the bot's persona before the first message is ever sent.
    import google.generativeai as genai
    import os

    # Load the API key from an environment variable.
    # Never hardcode keys. Note that os.getenv() only sees a .env file
    # if something (for example python-dotenv) has loaded it first.
    genai.configure(
        api_key=os.getenv("GEMINI_API_KEY")
    )

    # Initialize the model with a system prompt.
    # The system_instruction shapes ALL responses.
    # get_system_prompt() is shown under "System Prompt / Persona" and
    # must be defined above this call in the actual file.
    model = genai.GenerativeModel(
        model_name="gemini-2.0-flash",
        system_instruction=get_system_prompt()
    )

    # Start with an empty conversation history
    conversation_history = []

    print("✦ Gemini Chatbot — type 'quit' to exit")
    print("─" * 42)
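os.getenv() returns None when the variable is unset, so a missing key would only surface later as an API error inside the loop. A minimal sketch of failing fast at startup instead follows. The helper name require_api_key, its env parameter, and the error message are illustrative assumptions, not part of the original chatbot.py.

```python
import os


def require_api_key(env=None, name="GEMINI_API_KEY"):
    """Return the API key from the environment or raise a clear error.

    Hypothetical helper: `env` defaults to os.environ and exists only so
    the function can be exercised without touching the real environment.
    """
    env = os.environ if env is None else env
    key = (env.get(name) or "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Export it in your shell or load it from a .env file."
        )
    return key
```

In the setup code this could be used as genai.configure(api_key=require_api_key()), so the program stops with a readable message before the first prompt is shown.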
The Conversation Loop
A simple while True loop handles the entire conversation. It captures user input, checks for the exit command, appends to history, calls the API, prints the response, and appends the reply — then repeats.
The try/except block handles API errors gracefully — if the network drops or the API returns an error, the loop continues rather than crashing the program.
    while True:
        # Get user input
        user_input = input("\nYou > ").strip()

        # Exit condition
        if user_input.lower() in ['quit', 'exit', 'bye']:
            print("\n✦ Conversation ended. Goodbye!\n")
            break

        # Ignore empty input
        if not user_input:
            continue

        # Append user message to history
        conversation_history.append({
            "role": "user",
            "parts": [user_input]
        })

        try:
            # Send full history to the model
            response = model.generate_content(
                conversation_history
            )
            bot_reply = response.text

            # Print formatted response
            print(f"\nBot > {bot_reply}\n")

            # Append bot reply to history
            conversation_history.append({
                "role": "model",
                "parts": [bot_reply]
            })

        except Exception as e:
            # Drop the unanswered user message so the history keeps
            # alternating between "user" and "model" entries.
            conversation_history.pop()
            print(f"\n⚠ API error: {e}\n")
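The loop's logic can be checked without network access by factoring one turn into a function that receives the model call as a parameter. The sketch below is an illustration under stated assumptions, not the original chatbot.py: run_turn and fake_generate are hypothetical names, and fake_generate stands in for model.generate_content(...).text. It removes the user message again when the call fails, so the stored history keeps alternating between user and model entries.

```python
def run_turn(history, user_text, generate):
    """Run one chat turn against `history` (a list of role/parts dicts).

    `generate` is any callable that takes the full history and returns the
    reply text. Returns the reply, or None if the call raised.
    """
    history.append({"role": "user", "parts": [user_text]})
    try:
        reply = generate(history)
    except Exception as exc:
        # Roll back the unanswered message so the next turn starts clean.
        history.pop()
        print(f"API error: {exc}")
        return None
    history.append({"role": "model", "parts": [reply]})
    return reply


seen_lengths = []  # how many messages the stub received on each call


def fake_generate(history):
    """Offline stand-in for the real API call (hypothetical)."""
    seen_lengths.append(len(history))
    if history[-1]["parts"][0] == "fail":
        raise ConnectionError("simulated network drop")
    return f"echo: {history[-1]['parts'][0]}"


history = []
run_turn(history, "hello", fake_generate)         # success: two entries stored
run_turn(history, "fail", fake_generate)          # failure: message rolled back
run_turn(history, "still there?", fake_generate)  # sees the earlier exchange
```

Because the stub records the length of every history it receives, the run also makes the stateless design visible: each call is handed the entire conversation so far, not just the latest message.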
Conversation Memory
The conversation_history list grows with every turn. Each entry is a dict with a role ("user" or "model") and parts (a list with the message text).
Gemini requires this exact format. Sending the full list means the model has complete context — it can reference anything said in the current session, making follow-up questions and references work naturally.
    # What conversation_history looks like
    # after 2 turns:
    conversation_history = [
        {
            "role": "user",
            "parts": ["hey, what can you help with?"]
        },
        {
            "role": "model",
            "parts": ["I can help you brainstorm, answer... (full reply)"]
        },
        {
            "role": "user",
            "parts": ["explain APIs in simple terms"]
        },
        # ↑ This entire list is sent on every call
        # Gemini sees the full conversation context
    ]

    # Without this, the bot would have no memory:
    # "What did I just say?" → "I'm not sure..."
    # With this, it tracks the whole thread.
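Because the whole list is resent on every call, a long session eventually pushes against the model's context window and makes each request larger. One common mitigation, sketched here as an assumption rather than something the original chatbot does, is to send only the most recent messages. The function name trim_history and the max_messages default are illustrative.

```python
def trim_history(history, max_messages=20):
    """Return the most recent messages, starting on a user turn.

    Hypothetical helper: it does not modify `history`, and the system
    instruction is unaffected because it lives on the model object, not
    in this list.
    """
    if max_messages <= 0:
        return []
    recent = history[-max_messages:]
    # Drop leading model replies so the window opens with a user message.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent
```

The loop could then call model.generate_content(trim_history(conversation_history)) while still appending to the full list. The trade-off is that anything older than the window is no longer visible to the model.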
System Prompt / Persona
The system instruction is passed directly to GenerativeModel at initialization. It shapes every single response — tone, format, expertise, and constraints — without the user ever seeing it.
This is where prompt engineering matters most: a well-written system prompt turns a generic LLM into a focused assistant with consistent behavior.
    def get_system_prompt():
        return """
        You are a helpful AI assistant built in Python
        using the Gemini API as a class project.

        Your personality:
        - Clear and friendly, never condescending
        - Use plain language unless asked for technical depth
        - Give concise answers unless asked to elaborate
        - Use analogies when explaining complex concepts

        Capabilities:
        - Answer general knowledge questions
        - Explain programming and tech concepts
        - Help brainstorm ideas or outlines
        - Review or discuss short pieces of writing

        Constraints:
        - You do not have internet access mid-session
        - Acknowledge when a question is outside your scope
        - Never fabricate citations or specific statistics

        At the start of each session you are fresh — you
        have no memory of past sessions, only the current
        conversation history.
        """
By the Numbers
A list of {"role", "parts"} dicts is the data structure required for multi-turn context.
Project files: chatbot.py, .env.example, and requirements.txt, available on request.