Simplify SQL Automation with CrewAI: A Comprehensive Guide

SQL is easily one of the most important languages in the computer world. It serves as the primary means of communication with relational databases, where most organizations store their crucial data. SQL plays a significant role in analyzing complex data, creating data pipelines, and efficiently managing data warehouses. However, writing optimized SQL queries can often be challenging and cumbersome. Thanks to the rapid progress in AI over the past few years, we now have AI agents augmented with Large Language Models that can write queries on our behalf.

This article demonstrates how to build an AI agent using CrewAI, Composio, and GPT-4o to access databases and execute SQL queries to retrieve data.

Learning Objectives

  • Understand what CrewAI is.

  • Learn about Composio tools and integrations.

  • Learn how to build an SQL agent using Composio and CrewAI.

What is CrewAI?

CrewAI is an open-source collaborative multi-agent framework. It lets you build a crew of AI agents with various tasks, tools, roles, and motivations, akin to a real-world crew. CrewAI manages the flow of information from one agent to another, letting you build autonomous, efficient agentic workflows.

CrewAI mainly consists of five core features: Agents, Tasks, Tools, Processes, and Crews.

  • Agents: Agents operate as autonomous entities tasked with reasoning, delegating tasks, and communicating with fellow agents, much like a team in the real world.

  • Tasks: Tasks are precise assignments allocated to agents. They outline the steps and actions required for an agent to achieve a specific goal.

  • Tools: Tools equip agents to carry out tasks that exceed the capabilities of LLMs, such as web scraping, email responses, and task scheduling.

  • Process: In CrewAI, processes manage the execution of tasks by agents, ensuring that tasks are allocated and performed systematically. These processes can be sequential, where tasks are completed one after another, or hierarchical, where tasks are carried out based on a tiered authority structure.

  • Crews: Crews within CrewAI consist of collaborative agents equipped with tasks and tools, all working together to accomplish complex tasks.
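The sequential process described above can be sketched in plain Python. This is only a toy illustration of the idea, not CrewAI's implementation: `ToyTask` and `run_sequentially` are hypothetical names invented here, and each "task" is just a function that receives the previous task's output.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyTask:
    """A hypothetical stand-in for a task: a name plus a function to run."""
    name: str
    run: Callable[[str], str]  # receives the previous task's output as context


def run_sequentially(tasks: List[ToyTask], initial_context: str = "") -> str:
    """Run tasks in order, feeding each task's output to the next one."""
    context = initial_context
    for task in tasks:
        context = task.run(context)
    return context


# Two toy tasks that mirror the workflow built later in this article:
# the first "executes" a query, the second "logs" what the first produced.
tasks = [
    ToyTask("execute_query", lambda _: "SELECT * FROM products;"),
    ToyTask("write_log", lambda query: f"logged: {query}"),
]

print(run_sequentially(tasks))  # prints "logged: SELECT * FROM products;"
```

In a hierarchical process, the fixed loop would be replaced by a coordinating agent that decides which task runs next.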

Here is a mind map of CrewAI.

Image showing Crew AI process

What is Composio?

Composio is an open-source platform that provides tooling solutions for building reliable and useful AI agents. Composio provides over 150 tools and applications with built-in user authentication and authorization to help developers build reliable, secure, and production-ready agentic workflows. The tools have been designed from the ground up keeping real-world readiness of agents in mind.

Composio offers several advantages over other tooling solutions, including managed user authentication and authorizations, a wide array of tools, a dashboard for monitoring integrations, and the flexibility to add custom tools.

Composio consists of four key components:

  • Entities: In Composio, an "entity" is a container for all user or organization accounts and tools, allowing centralized management from a single dashboard.

  • Integrations: These are tool configurations, such as permissions and OAuth ClientID/Secret, used to connect user accounts to Composio. You can use your own configurations or opt for Composio's default integrations.

  • Actions: Actions are tasks performed by integrated tools, like sending a Slack message or scheduling a calendar event.

  • Triggers: Triggers are predefined conditions that, when met, initiate webhooks to your agents, containing event details such as entities, message text, and more.

Building the SQL Agent

Now that the basics are covered, we can start with the coding part. This GitHub repository contains code for building an SQL agent using major frameworks like LangChain, CrewAI, and Llama Index. This walk-through will focus on the CrewAI implementation. You can check out the rest as well.

As with any Python project, we will first set up a virtual environment and environment variables, and install libraries.

Step 1: Installing Libraries

Create a virtual environment using Python's venv module and activate it.

python -m venv sqlagent
cd sqlagent
source bin/activate

Install the following libraries using pip install:

composio-core
composio-langchain
composio-crewai
python-dotenv
langchain-openai

Create a .env file and add OPENAI_API_KEY and COMPOSIO_API_KEY as variables.

Run composio login to log in to your Composio account.

Note: To get the Composio API key, run composio whoami and assign the key to the COMPOSIO_API_KEY variable in the .env file.
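Before going further, it can help to confirm that both keys are actually visible to Python. The following is a minimal sketch using only the standard library; `missing_keys` and `REQUIRED_KEYS` are hypothetical helper names introduced here, not part of Composio or CrewAI.

```python
import os
from typing import Iterable, List, Mapping

# The two variables this article stores in the .env file.
REQUIRED_KEYS = ("OPENAI_API_KEY", "COMPOSIO_API_KEY")


def missing_keys(env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required variable names that are absent or blank in env."""
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    # In the real script, call dotenv.load_dotenv() first so .env is loaded.
    missing = missing_keys(os.environ)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All required environment variables are set.")
```

Running this check early gives a clearer message than the KeyError that os.environ["OPENAI_API_KEY"] would raise later in the script.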

Step 2: Defining LLMs

Now, import all the libraries and define the LLM. We will use GPT-4o.

import os
import sqlite3

import dotenv
from composio_langchain import Action, App, ComposioToolSet
from crewai import Agent, Crew, Process, Task
from langchain_openai import ChatOpenAI

from composio.local_tools import filetool, sqltool

# Load environment variables from .env file
dotenv.load_dotenv()

# Initialize the ChatOpenAI model with GPT-4o and the API key from environment variables
llm = ChatOpenAI(model="gpt-4o", openai_api_key=os.environ["OPENAI_API_KEY"])

Next, get the SQL tool and the file tool from Composio. Any other Composio tool can be loaded the same way.

# Initialize the ComposioToolSet
toolset = ComposioToolSet(api_key=os.environ["COMPOSIO_API_KEY"])

# Get the SQL and file tools from the ComposioToolSet
tools = toolset.get_tools(apps=[App.SQLTOOL, App.FILETOOL])
file_tool = toolset.get_tools(apps=[App.FILETOOL])

Step 3: Defining Agents

CrewAI allows us to define agents with roles, backstories, and goals. This gives the LLM additional context before it executes an action.

Here, we will define an SQL query executor agent and a file writer agent to log all the SQL queries executed in a log.txt file.

# Define the Query Executor Agent
query_executor_agent = Agent(
    role="Query Executor Agent",
    goal="""Execute the SQL query and return the results.
    After execution of a query evaluate whether the goal given by the user input is achieved. If yes, stop execution
    """,
    verbose=True,
    memory=True,
    backstory=(
        "You are an expert in SQL and database management, "
        "skilled at executing SQL queries and providing results efficiently."
    ),
    llm=llm,
    allow_delegation=False,
    tools=tools,
)

# Define the File Writer Agent
file_writer_agent = Agent(
    role="File Writer Agent",
    goal="""Document every SQL query executed by the Query Executor Agent in a 'log.txt' file.
    Perform a write operation in the format '<executed_query>\\n'
    The log should have the record of every SQL query executed """,
    verbose=True,
    memory=True,
    backstory=(
        "You are an expert in logging and documenting changes to SQL databases."
    ),
    llm=llm,
    allow_delegation=False,
    tools=file_tool,
)

Step 4: Defining Tasks

In this step, define the tasks to be performed by the above agents. Make each description as precise and detailed as possible so the agents can execute it reliably. Each task includes a description, the expected output, and the agent responsible for accomplishing it.

# User-provided description of the database and input query
user_description = "The database name is company.db"  # Edit the description for your database and tables
user_input = "fetch the rows in the products table"  # Edit the input for the action you want it to perform

# Define the task for executing the SQL query
execute_query_task = Task(
    description=(
        "This is the database description: "
        + user_description
        + ". Form a SQL query based on this input: "
        + user_input
        + ". Execute the SQL query you formed "
        "and return the results. Pass the query and connection string parameter. "
        "The connection string parameter should just be of the format <database_name_given>.db"
    ),
    expected_output="Results of the SQL query were returned. Stop once the goal is achieved",
    tools=tools,
    agent=query_executor_agent,
)

# Define the task for writing the executed SQL query to a log file
file_write_task = Task(
    description=(
        "the executed query has to be written in a log.txt file to record history, "
        "overwrite parameter should be set to false"
    ),
    expected_output="Executed query was written in the log.txt file",
    tools=file_tool,
    agent=file_writer_agent,
    allow_delegation=False,
)

In the above code snippet, we assumed the database to be company.db. Replace the database name and query as per your needs.
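If you do not have a database at hand, the sketch below creates a small company.db with a products table using Python's built-in sqlite3 module. The schema and sample rows are assumptions made purely for illustration; the article does not specify what the products table contains.

```python
import sqlite3


def create_sample_db(path: str = "company.db") -> int:
    """Create a small products table at `path` and return its row count.

    The columns and rows are hypothetical sample data.
    """
    conn = sqlite3.connect(path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS products ("
                "id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL)"
            )
            # INSERT OR REPLACE keeps the script safe to run more than once.
            conn.executemany(
                "INSERT OR REPLACE INTO products (id, name, price) VALUES (?, ?, ?)",
                [(1, "Keyboard", 49.99), (2, "Monitor", 199.0), (3, "Mouse", 19.5)],
            )
        (count,) = conn.execute("SELECT COUNT(*) FROM products").fetchone()
        return count
    finally:
        conn.close()


if __name__ == "__main__":
    print(create_sample_db(), "rows in products")
```

Run it once from the directory where the agent script lives, so that the connection string company.db resolves to the same file.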

Step 5: Define the Crew and Run it

All that remains is to define the crew and run the workflow. This is where we set the process parameter. Because the file writer needs the output of the query executor, information must flow from one agent to the next, so we set the process to sequential.

# Define the crew with the agents and tasks
crew = Crew(
    agents=[query_executor_agent, file_writer_agent],
    tasks=[execute_query_task, file_write_task],
    process=Process.sequential,
)

# Kickoff the process and print the result
result = crew.kickoff()
print(result)

Now, run the script.

Infographic showing Script for building SQL Agent with help of Crew AI
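Since the agent's answer is generated by an LLM, it is worth cross-checking it against the database directly. The following is a minimal sketch with the standard sqlite3 module; `fetch_all_rows` is a hypothetical helper, and the in-memory table in the demo is a stand-in for your own data.

```python
import sqlite3
from typing import List, Tuple


def fetch_all_rows(conn: sqlite3.Connection, table: str) -> List[Tuple]:
    """Return every row of `table`.

    The table name is validated first because SQL placeholders
    cannot be used for identifiers.
    """
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    return conn.execute(f'SELECT * FROM "{table}"').fetchall()


if __name__ == "__main__":
    # Demo on an in-memory database; use sqlite3.connect("company.db")
    # to check the database the agent actually queried.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO products (name) VALUES ('Keyboard')")
    print(fetch_all_rows(conn, "products"))  # prints [(1, 'Keyboard')]
    conn.close()
```

If the rows printed here differ from what the agent reported, inspect log.txt to see which query the agent actually executed.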

Next Steps

In this tutorial, you developed an SQL agent using CrewAI, Composio, and GPT-4o. The agent can access a relational database, execute user-defined queries, and log every executed query to keep track of them. You can go a step further and add a code interpreter tool and a code-writing agent to display graphs and plots of the retrieved data. Composio offers numerous tools and integrations that allow you to expand the capabilities of your agentic workflows to accomplish complex tasks.