caio-moliveira/README.md

Hi 👋, I'm Caio Oliveira

Senior AI Engineer · Data Engineering Instructor · AI Systems Builder

I design and ship production AI systems that transform complex documents and data into reliable decisions.


🧭 Executive Profile

AI Engineer focused on LLM systems, autonomous agents, and data-intensive AI platforms. I work across the full lifecycle: architecture, retrieval engineering, orchestration, observability, and production deployment.

My specialization is building high-impact AI solutions in regulated and document-heavy environments, combining:

  • RAG architectures (retrieval quality, chunking strategy, embeddings, vector search)
  • Agentic workflows (tool use, memory, stateful orchestration)
  • Data/ML engineering foundations (pipelines, reliability, performance, governance)

💼 Professional Experience

Senior AI Engineer · Tribunal de Contas do Estado de Minas Gerais (TCEMG)

Jul 2025 – Present

  • Leading the implementation of AI initiatives across public-sector business areas, including the institution’s first large-scale AI programs.
  • Co-developed an intelligent legal/administrative assistant with LangChain, LangGraph, and LlamaIndex, integrating RAG, OCR, and memory-enabled agents.
  • Built AI auditing agents that analyze approximately 1M documents per year, enabling complete annual processing for the first time in the institution’s history.

Technical Monitor & Instructor (Data Engineering / AI Engineering) · Jornada de Dados

Feb 2025 – Present

  • Delivering advanced workshops and bootcamps covering RAG design, Transformer foundations, and LLM implementation for business use cases.
  • Producing technical content and mentoring engineers through practical projects in semantic search, vector databases, and intelligent data workflows.

🚀 Featured Repositories

  • ai-engineer-roadmap Structured learning path for AI Engineers, from fundamentals to production architecture.

  • workshop-ai-agent Practical workshop repository focused on building AI agents with orchestration, tools, and real-world workflows.

  • rag-project End-to-end RAG pipeline using LangChain, LangGraph, Qdrant, and Langfuse with a production-oriented architecture.


🧠 Core AI Engineering Skills

LLM & Agent Systems

  • Retrieval-Augmented Generation (RAG) architecture and optimization
  • Agent design with tool calling, memory, and multi-step reasoning flows
  • Prompt/system design for robustness, consistency, and safety

Retrieval & Knowledge Infrastructure

  • Chunking strategy design (semantic, fixed, and hybrid)
  • Embedding model evaluation and retrieval quality tuning
  • Vector search pipelines with hybrid retrieval and reranking patterns

Data & Platform Engineering for AI

  • ETL/ELT for AI-ready datasets (batch and near real-time)
  • API-first backend services for AI products
  • Monitoring and evaluation for LLM applications (latency, quality, traceability)

Production & Scale

  • Deployment patterns for cloud-native AI systems
  • Cost/performance tradeoff analysis for model and infrastructure choices
  • Reliability and observability in mission-critical AI workflows

🛠️ Technology Stack

AI / LLM Ecosystem

LangChain · LangGraph · LlamaIndex · OpenAI API · Qdrant · Langfuse

Backend & Services

Python · FastAPI · Pydantic · Uvicorn

Data Engineering

Pandas · NumPy · dbt · Apache Airflow

Databases & Storage

PostgreSQL · MongoDB · Redis · Snowflake

Cloud & DevOps

AWS · GCP · Azure · Docker · Git · Linux


🎯 Current Focus

  • Advancing agentic AI systems for large-scale document intelligence in the public sector
  • Mentoring engineers to move from data foundations into production AI engineering
  • Building reusable patterns for trustworthy, observable, and scalable LLM applications

Open to collaborations on AI Engineering, RAG platforms, and agent-based automation systems.

Pinned

  1. sales-pipeline-project (Public)

    This project demonstrates an end-to-end data pipeline, integrating cloud storage, data processing, and real-time visualization. It serves as a practical foundation for similar data engineering and…

    Python 14 3

  2. dbt-snowflake-project (Public)

    This repository implements a fully automated data pipeline integrating AWS S3, Snowflake, dbt, Apache Airflow, and Streamlit. It handles data ingestion, transformation, and visualization, providing…

    Python 1 1

  3. airflow-astro-project (Public)

    This project demonstrates using the Astro CLI to manage and deploy an Airflow DAG that calls an external API and stores the data in PostgreSQL. This setup exemplifies how o…

    Python 1

  4. read-files-to-dataframe (Public)

    This project showcases how object-oriented principles like hierarchy, polymorphism, and encapsulation can improve data handling and processing. Python, SQLite, and Poetry for dependency management,…

    Python

  5. databricks-duckdb-1billion-rows (Public)

    Python

  6. ai-agent-service (Public)

    AI agent that connects to your database, answers your questions about the data, and generates the query behind each answer.

    Python 9 4