Hands On with Cline & Kilo Code • Pipitone Labs

Introduction#

As a cloud engineer focused on leveling up my Kubernetes and modern DevOps skills, I’m intentionally building real hands-on experience with agentic AI - autonomous agents that can reason step-by-step, orchestrate, and execute complex workflows with almost no hand-holding. I’m convinced that if I don’t start actively using these tools, I’ll fall behind fast.

That’s why I’m experimenting with several different solutions and use cases, to see whether I can incorporate them into my day-to-day and whether they improve my workflows. So far, it’s been a great experience for managing the Kubernetes cluster in my HomeLab, and a big time saver. I’ve run a few personal experiments to learn (and move) even faster.

I wanted to share my experience with two top tools: the VS Code extensions Cline and Kilo Code.

I tested them on different scenarios: Kubernetes app deployments and configurations, Terraform module refactoring, and custom development of various web apps. My testing focused on autonomy, accuracy, IDE and CLI integration, and cost.

My Testing Process#

I watched various YouTube videos from people who have done extensive benchmarking and extremely thorough test cases, which was a good starting point for getting familiar with the tools.

What I realized is that, like many other things, the only way to determine which tool is “best” is to get in there and see for myself. Which one works best for me?

Tasks by complexity:

Simple → Lint/syntax fixes in YAML/Terraform.
Medium → Multi-file refactors and Terraform modularization.
Complex → Full architecture design + implementation.

What I’m really judging them on:

How much do I have to babysit it?
Are there lots of “API Request Failed” errors and how does it impact cost?
Does the code it spits out actually work and follow best practices?
Is the result something I can ship to prod?
Does it fit smoothly into my normal day-to-day flow, or am I fighting the tool the whole time?
When it screws up, does it hallucinate wildly, lose context, get stuck in loops, or delete file content? And how easy is it to get back on track?

Cline: Open-Source Coding Agent#

Cline ↗ is a pioneering open-source VS Code extension that acts as a true coding agent — it can read/write files, run terminal commands, control your browser, and handle complex multi-step tasks, always with human approval.

Strengths

Multimodal (understands screenshots and images)
Configurable to use the API provider of your choice; supports OpenRouter and others
Free options: various free models such as grok-code-fast-1, minimax-m2, devstral-2512 (at the time of this writing).

Weaknesses

Can struggle with very large codebases
Occasionally overconfident or stuck in loops on vague prompts
Requires API keys and minor setup/tweaking
Seems to only have Plan and Act modes
A bit more difficult to configure MCP servers
Doesn’t support codebase indexing - but that might not be a bad thing ↗

A powerful, flexible agent that still respects your control - one of the best open-source options available.

Example: “Design and architect a modular solution written in terraform to provision a serverless solution in AWS ECS that is highly available and fault tolerant. This should use fargate and support blue/green deployments.”

Output: A detailed plan containing an architecture overview and details on ECS Fargate, ALB, CodeDeploy, CloudWatch, and ASG, along with a breakdown of the separate Terraform modules and an explanation of how the blue/green deployment flow works. Cline then asks: “Would you like me to proceed with implementing this architecture, or would you prefer to discuss any modifications to the design first?”

Kilo Code: A Newer Open-Source Coding Agent#

Kilo Code ↗ is a polished, powerful open-source agentic extension for VS Code and JetBrains IDEs - a fork of Roo Code, which itself is a fork of Cline. It combines semantic codebase search, precise diff-based edits, terminal/browser automation, and multi-agent orchestration.

Strengths

Excellent codebase understanding with targeted, minimal edits
Smooth tool chaining (search → read → plan → diff → test)
Architect mode dedicated to high-level planning: outlines project structure, recommends libraries, suggests modular architectures, and guides implementation strategy
Strictly follows best practices with clear, well-documented outputs
Model-agnostic: works with Kilo Gateway (its own API proxy with generous $20 bonus credits on first top-up), OpenRouter, or local setups
Free options: plenty of free models - I get emails about these frequently as new models are released
I am able to configure codebase indexing (I don’t fully understand how this works yet)

Weaknesses

Steeper learning curve for custom modes and rules
Can be token-heavy on very large repositories
I’ve been getting a lot of “API Request Failed” errors recently, which is extremely frustrating because the API requests have costs associated with them

A versatile solution that takes the Cline foundation and pushes it further - especially for architecture-driven workflows.

It was relatively easy to deploy Ollama ↗ and Qdrant ↗ in my Kubernetes cluster, so I set up indexing for testing purposes.
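
A toy sketch may help with what codebase indexing does in general terms: split the code into chunks, turn each chunk into a vector, store the vectors, and answer a query by returning the most similar chunks. This is not Kilo Code's implementation. In a setup like the one above, the vectors would presumably come from an embedding model served by Ollama and be stored in Qdrant. The word-count `embed` function, the in-memory index, and the sample file paths below are stand-ins invented for illustration.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real setup would call an embedding model)."""
    return Counter(re.findall(r"[a-z0-9_]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(chunks: dict[str, str]) -> list[tuple[str, Counter]]:
    """Store one vector per chunk (this list stands in for a vector database)."""
    return [(path, embed(text)) for path, text in chunks.items()]


def search(index: list[tuple[str, Counter]], query: str, top_k: int = 1) -> list[str]:
    """Return the paths of the top_k chunks most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [path for path, _ in ranked[:top_k]]


# Hypothetical chunks; real indexing would read and split actual files.
chunks = {
    "modules/ecs/main.tf": "resource aws_ecs_service fargate task definition load balancer",
    "k8s/qdrant/deployment.yaml": "kind deployment qdrant container image replicas volume",
    "README.md": "homelab kubernetes cluster setup notes",
}
index = build_index(chunks)
print(search(index, "where is the fargate service defined"))
```

The point of the index is that the agent can pull in only the few chunks relevant to a request instead of reading the whole repository, which is where the token savings would come from.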

Custom Modes for Specialized Workflows

I created these custom modes to complement the core modes (architect, code, ask, debug, orchestrator):

Documentation Specialist: Writing and maintaining project documentation, README files, and API docs. Only edits markdown files to prevent accidental code changes.
Test Engineer: Writing tests, debugging test failures, and improving test coverage. Restricted to test files to maintain focus.
Frontend Specialist: Building React, TypeScript, CSS components, styling, and frontend logic. Limited to frontend files to avoid backend concerns.
Code Reviewer: Analyzing code for reviews, security audits, or learning. No edit permissions ensure you can’t accidentally modify anything.

Mode Definitions and Usage#

Here’s how I configured these modes in the Kilo Code extension:

Documentation Specialist

Role Definition: You are a technical writing expert specializing in clear, comprehensive documentation. You excel at explaining complex concepts simply and creating well-structured docs.

Short description: Perfect for technical writers and documentation maintainers

Use case: Writing and maintaining project documentation, README files, and API docs. Only edits markdown files to prevent accidental code changes.

Custom instructions: Focus on clarity, proper formatting, and comprehensive examples. Always check for broken links and ensure consistency in tone and style.

Test Engineer

Role Definition: You are a QA engineer and testing specialist focused on writing comprehensive tests, debugging failures, and improving code coverage.

Short description: Dedicated to code quality and testing

Use case: Writing tests, debugging test failures, and improving test coverage. Restricted to test files to maintain focus.

Custom instructions: Prioritize test readability, comprehensive edge cases, and clear assertion messages. Always consider both happy path and error scenarios.

Frontend Specialist

Role Definition: You are a frontend developer expert in React, TypeScript, and modern CSS. You focus on creating intuitive user interfaces and excellent user experiences.

Short description: Focused on UI/UX implementation

Use case: Building React components, styling, and frontend logic. Limited to frontend files to avoid backend concerns.

Custom instructions: Prioritize accessibility, responsive design, and performance. Use semantic HTML and follow React best practices.

Code Reviewer

Role Definition: You are a senior software engineer conducting thorough code reviews. You focus on code quality, security, performance, and maintainability.

Short description: Read-only mode for safe code analysis

Use case: Analyzing code for reviews, security audits, or learning. No edit permissions ensure you can’t accidentally modify anything.

Custom instructions: Provide constructive feedback on code patterns, potential bugs, security issues, and improvement opportunities. Be specific and actionable in suggestions.
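
The file restrictions are what keep each mode in its lane. The sketch below models that rule in plain Python: each mode carries an optional regular expression, and an edit is allowed only when the file path matches it. The `Mode` class, the `can_edit` helper, and the specific patterns are hypothetical illustrations of the idea, not Kilo Code's configuration format or API.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mode:
    name: str
    # Regex a file path must match to be editable; None means read-only.
    edit_pattern: Optional[str]


# Patterns are assumptions chosen for illustration only.
MODES = {
    "docs": Mode("Documentation Specialist", r"\.md$"),
    "test": Mode("Test Engineer", r"(_test\.go|\.test\.(ts|tsx|js))$"),
    "frontend": Mode("Frontend Specialist", r"\.(tsx|ts|jsx|js|css)$"),
    "review": Mode("Code Reviewer", None),
}


def can_edit(mode: Mode, path: str) -> bool:
    """Allow an edit only if the mode has a pattern and the path matches it."""
    if mode.edit_pattern is None:
        return False
    return re.search(mode.edit_pattern, path) is not None


checks = [
    ("docs", "README.md"),
    ("docs", "main.go"),
    ("test", "handler_test.go"),
    ("review", "README.md"),
]
for slug, path in checks:
    print(f"{MODES[slug].name} -> {path}: {can_edit(MODES[slug], path)}")
```

A read-only mode is simply the case with no pattern at all, which is why the Code Reviewer can never modify a file, whatever it is asked to do.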

Example Usage Patterns#

I used these modes in combination with the core modes for different workflows:

Architecture + Documentation: Start with Architect mode to design the system, then switch to Documentation Specialist to document the architecture.
Feature Development: Use Code mode for implementation, then Test Engineer for test coverage, and Frontend Specialist for UI components.
Code Review: Use Code Reviewer mode to analyze changes before merging.
Debugging: Use Debug mode to identify issues, then Test Engineer to add tests (in my case, written in Go).

Switching Between Models and Modes#

One of the key advantages I found with Kilo Code was the ability to switch between different models and modes seamlessly. For example:

Use a faster, cheaper model like x-ai/grok-code-fast-1 for initial exploration and planning
Switch to a more powerful model like anthropic/claude-sonnet-4.5 (or Opus) for backend implementation
Switch to a model that is a bit stronger at frontend and design work, like google/gemini-3-pro-preview (if building a web app)
Use specialized modes like Documentation Specialist for documentation tasks (pick any model of your choice)
Switch to Code Reviewer mode for final validation

This flexibility allowed me to optimize both cost and quality, using the right tool for each specific task.
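
The per-phase choices above amount to a small lookup table. Here is a minimal sketch of that table in Python, using the model identifiers mentioned in this post. The phase names, the `pick_model` helper, and the fallback to the cheaper model are made up for illustration; this is just a way to write the decision down, not something Kilo Code reads.

```python
# Phase names are hypothetical; model identifiers are the ones named in this post.
MODEL_BY_PHASE = {
    "explore": "x-ai/grok-code-fast-1",
    "plan": "x-ai/grok-code-fast-1",
    "backend": "anthropic/claude-sonnet-4.5",
    "frontend": "google/gemini-3-pro-preview",
}

# Assumption: fall back to the cheaper model when a phase is not listed.
DEFAULT_MODEL = "x-ai/grok-code-fast-1"


def pick_model(phase: str) -> str:
    """Return the model to use for a given phase of work."""
    return MODEL_BY_PHASE.get(phase.strip().lower(), DEFAULT_MODEL)


for phase in ["explore", "backend", "frontend", "docs"]:
    print(f"{phase}: {pick_model(phase)}")
```

Keeping the mapping in one place makes it easy to swap a model out when pricing or availability changes.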

Example prompt 1: “Refactor a legacy Terraform module into reusable components with comprehensive documentation.”

Example prompt 2: “Design and architect a SaaS based solution that does X, Y, and Z. For backend tasks, number the tasks with B1, B2, etc. For frontend tasks, number them as F1, F2, etc.”

Workflow:

Architect mode: Design the new module structure
Code mode: Implement the refactoring
Documentation Specialist mode: Create comprehensive documentation
Test Engineer mode: Add tests
Code Reviewer mode: Final validation

Output: Clean, modular Terraform with excellent docs and tests. Required minimal follow-ups thanks to the structured approach.
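
One plausible benefit of the B1/F1 numbering in example prompt 2 is that the resulting plan is easy to split by area, for instance to hand backend and frontend tasks to different models or modes. The sketch below assumes the plan comes back as one task per line in a `B1: ...` / `F1: ...` shape. That format, the sample plan, and the `group_tasks` helper are hypothetical; real output will vary by model.

```python
import re
from collections import defaultdict

# Matches lines such as "B1: Define the data model" or "F2. Build the page".
TASK_LINE = re.compile(r"^\s*([BF])(\d+)[.:)]?\s+(.*\S)\s*$")


def group_tasks(plan: str) -> dict[str, list[tuple[int, str]]]:
    """Group numbered tasks into backend (B) and frontend (F), sorted by number."""
    groups: dict[str, list[tuple[int, str]]] = defaultdict(list)
    for line in plan.splitlines():
        match = TASK_LINE.match(line)
        if match:
            prefix, number, title = match.groups()
            area = "backend" if prefix == "B" else "frontend"
            groups[area].append((int(number), title))
    return {area: sorted(tasks) for area, tasks in groups.items()}


# Hypothetical plan text for illustration.
plan = """
B1: Define the data model
F1: Build the login page
B2: Add the REST API
Notes: revisit auth later
"""
print(group_tasks(plan))
```

Lines that do not start with a B or F task number, such as the notes line, are ignored.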

I have not yet tested Orchestrator mode, which coordinates tasks across multiple modes.

Additional Experiments: AGENTS.md, Spec Kit, and Kilo Code Memory Bank#

AGENTS.md#

AGENTS.md ↗ is a standardized, project-root Markdown file that provides AI coding agents (including Kilo Code) with persistent, human-readable guidelines on coding style, architecture, workflow, and setup instructions. Refer to this example ↗ to see how I’m using it in this project.

GitHub’s Spec Kit#

I briefly experimented with GitHub’s Spec Kit ↗, a toolkit for generating and validating code/API specifications in agentic workflows. It excels at automated spec testing via GitHub Actions and integrates with VS Code for spec-driven development. The feedback I’ve heard is that it can really improve your codebase, but that it adds unnecessary token overhead.

Kilo Code Memory Bank#

Kilo Code’s memory bank feature enables persistent context storage across agent sessions. You define memory sections, which agents then retrieve in later sessions. This maintains long-term knowledge, such as user preferences or project constraints, and reduces repetition.

Refer to this example ↗ to see how I’m using the memory bank in my Kubernetes repo.

My Current Workflow:

Kilo Code for implementation (with custom modes for specialized tasks)
Cline as a backup in case API errors increase with Kilo Code
Planning to evaluate Claude Code next for potential consolidation

Closing Thoughts#

The ability to switch between different models and specialized modes in Kilo Code has been awesome, allowing me to optimize for both cost and quality depending on the task at hand. While I’m interested in exploring Claude Code for its potential as a unified platform, the flexibility of Kilo Code’s multi-model and multi-mode approach has been excellent for my current workflow.

It’s also mostly free (for the moment), with the option to pay as you go. I have some credits reserved for scenarios when I might want to use Sonnet 4.5, Opus 4.5, or Gemini 3 Pro.

Thanks for reading. This is all new to me and I’m learning as I go, so of course this may not be the most accurate write-up of the century, but it works for me!