AutoAgent

An automation tool for agent engineering. Given a task, an AI agent autonomously builds a harness and iteratively improves it.

Key Points

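Self-configuring agent: automatically modifies the system prompt, tools, agent configuration, and orchestration
Benchmark-driven optimization: after each change, run the benchmark → check the score → keep or discard the change, then repeat
Write instructions in program.md, and the meta-agent runs the agent engineering loop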
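
The change → benchmark → keep/discard loop in the points above can be sketched as a plain hill-climbing routine. This is a minimal illustration, not AutoAgent's actual implementation: `hill_climb`, `propose`, `benchmark`, and the toy `temperature` setting are all hypothetical names, and the real tool edits harness code and scores it with Harbor rather than tweaking a single number.

```python
import random
from typing import Callable, Tuple


def hill_climb(
    initial: dict,
    propose: Callable[[dict], dict],
    benchmark: Callable[[dict], float],
    iterations: int,
) -> Tuple[dict, float]:
    """Keep a candidate only if its benchmark score beats the current best."""
    best, best_score = initial, benchmark(initial)
    for _ in range(iterations):
        candidate = propose(best)        # hypothetical: one edit to the harness
        score = benchmark(candidate)     # hypothetical: one benchmark run
        if score > best_score:
            best, best_score = candidate, score  # keep the change
        # otherwise the candidate is discarded and the next edit starts from best
    return best, best_score


# Toy stand-ins (assumptions for illustration only).
def toy_benchmark(harness: dict) -> float:
    # Pretend the benchmark is best when temperature is 0.3.
    return -abs(harness["temperature"] - 0.3)


def make_toy_propose(seed: int) -> Callable[[dict], dict]:
    rng = random.Random(seed)

    def propose(harness: dict) -> dict:
        # Return a modified copy; the current best is never mutated.
        delta = rng.uniform(-0.1, 0.1)
        return {**harness, "temperature": harness["temperature"] + delta}

    return propose
```

Because a candidate is kept only on strict improvement, the best score never decreases across iterations, which is the property the keep/discard step relies on.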

How It Works

agent.py: the harness under test (config, tools, registry, routing, Harbor adapter)
program.md: instructions for the meta-agent (edited by the user)
tasks/: evaluation tasks in Harbor format
.agent/: reusable prompts, skills, and similar assets

Quick Start

# Install dependencies
uv sync
 
# Set environment variables
cat > .env << 'EOF'
OPENAI_API_KEY=***
EOF
 
# Build the base image
docker build -f Dockerfile.base -t autoagent-base .
 
# Run a single benchmark
rm -rf jobs; mkdir -p jobs && uv run harbor run -p tasks/ --task-name "<task-name>" -l 1 -n 1 --agent-import-path agent:AutoAgent -o jobs --job-name latest
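
When running several tasks in turn, it can help to assemble the benchmark command programmatically. The sketch below only rebuilds the `harbor run` invocation shown above, using exactly the flags from that command; `build_harbor_command` is a hypothetical helper, not part of AutoAgent or Harbor, and it does not clear or create the `jobs` directory.

```python
import shlex
from typing import List


def build_harbor_command(task_name: str, job_name: str = "latest") -> List[str]:
    """Return the argv list for the single-benchmark command from Quick Start."""
    return [
        "uv", "run", "harbor", "run",
        "-p", "tasks/",
        "--task-name", task_name,
        "-l", "1",
        "-n", "1",
        "--agent-import-path", "agent:AutoAgent",
        "-o", "jobs",
        "--job-name", job_name,
    ]


def as_shell_string(argv: List[str]) -> str:
    """Quote the argv list so it can be pasted into a shell."""
    return shlex.join(argv)
```

Passing the list to `subprocess.run(argv, check=True)` avoids shell-quoting problems with unusual task names; `as_shell_string` is only for logging or copy-pasting.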

Content

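AutoAgent, developed by Kevin Gu (Thirdlayer), is a tool that puts the idea of "autoresearch for agent engineering" into practice. The user defines the desired agent's spec in program.md, and the meta-agent modifies the harness code and hill-climbs on the benchmark score. The Harbor framework serves as the benchmark execution environment.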

Sources

Repository: kevinrgu/autoagent
Presentation: Kevin Gu on X
