Building a PM task automation agent with a local LLM + Claude Code

As a PM, the work I do most often is repetitive document work: summarizing meeting minutes, drafting specifications, organizing issue priorities. How different would the job be if an AI agent automated these routines? I built one to find out. It is a PM workflow agent that combines a local LLM (Ollama with an open-weight model such as Mistral or Qwen) and Claude Code.

Why Local LLM?

Cloud APIs are convenient, but sending internal company documents or meeting content to an external server raises security concerns. A local LLM offers the following advantages:

Security: The model runs without an internet connection, so sensitive internal data never leaves the machine.
Cost: There are no per-call API charges. The more repetitive tasks you run, the greater the savings.
Speed: There is no network latency, although actual response time depends on your local hardware.
Offline: You can run the agent without an internet connection.

However, a local LLM has limited reasoning ability compared to a cloud API. The key is to delegate tasks that require long-context processing, complex code generation, or precise judgment to Claude Code.

Ollama installation and model settings

Install Ollama

Download the macOS/Linux/Windows installation file from ollama.com or install with the command below.

# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh

After installation, run the server.

ollama serve

The default port is 11434, and the REST API is served at http://localhost:11434.
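
To check that the server is reachable without installing any client library, the REST API can be called with the Python standard library. This is a minimal sketch that assumes the default address above and Ollama's `/api/generate` endpoint in non-streaming mode; the helper names `build_generate_request` and `generate` are made up for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default address stated above


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the generated text (needs `ollama serve` running)."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("mistral:7b", "Say hello")` should return a short completion once the model has been pulled.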

Model selection guide

With local LLMs, model size trades off sharply against performance. The following table compares practical models for PM task automation.

Model	Size	Memory requirement	Korean language support	Use
mistral:7b	4.1GB	8GB RAM	Normal	General summarization, classification
llama3.1:8b	4.7GB	8GB RAM	Normal	General purpose
gemma3:12b	8.1GB	16GB RAM	Good	Long documents
qwen2.5:14b	9GB	16GB RAM	Excellent	Korean-focused work
llama3.1:70b	40GB	64GB RAM	Excellent	High-quality inference

If you are a PM who mainly deals with Korean documents, `qwen2.5:14b` is a practical choice.
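
The selection logic in the table can be written down as a small lookup. The sketch below copies the memory and Korean-support columns from the table; the function name `pick_model` and the ranking rule (best Korean support first, then smallest memory footprint) are assumptions for illustration, not a benchmark result.

```python
# (model, minimum RAM in GB, Korean support) copied from the table above
MODELS = [
    ("mistral:7b", 8, "normal"),
    ("llama3.1:8b", 8, "normal"),
    ("gemma3:12b", 16, "good"),
    ("qwen2.5:14b", 16, "excellent"),
    ("llama3.1:70b", 64, "excellent"),
]
KOREAN_RANK = {"normal": 0, "good": 1, "excellent": 2}


def pick_model(ram_gb: int) -> str:
    """Return the model with the best Korean support that fits in `ram_gb`.

    Ties are broken by the smaller memory requirement, then by table order.
    """
    candidates = [m for m in MODELS if m[1] <= ram_gb]
    if not candidates:
        raise ValueError(f"No model in the table fits in {ram_gb}GB of RAM")
    best = min(candidates, key=lambda m: (-KOREAN_RANK[m[2]], m[1]))
    return best[0]
```

With 16GB of RAM this picks `qwen2.5:14b`, which matches the recommendation above.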

# Download the model (first time only)
ollama pull qwen2.5:14b
# Test run
ollama run qwen2.5:14b "PRD란 무엇인가?"

Agent overall architecture

graph TD
A["PM input (meeting minutes / issues / git log)"] --> B{"Router (router.py)"}
B -->|Simple summary/classification| C["Local LLM (Ollama + qwen2.5)"]
B -->|Complex judgment/code| D["Claude Code (Claude API)"]
C --> E["Post-processing (parser.py)"]
D --> E
E --> F[Output]
F --> G[Notion]
F --> H[GitHub Issues]
F --> I[Slack]

The agent is broadly composed of three layers.

Router: Determines the complexity of the input task and decides whether to send it to the local LLM or Claude Code.
Processor: Performs the actual work with prompts appropriate for each LLM.
Sink: Exports results to PM tools such as Notion, GitHub, and Slack.
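
The three layers can be wired together with plain callables before any LLM is involved. This is a minimal sketch under the assumption that each processor takes text and returns text and each sink takes the result; `run_pipeline` and the type aliases are hypothetical names, not part of the project files shown later.

```python
from typing import Callable

Processor = Callable[[str], str]  # e.g. a function that calls Ollama or Claude
Sink = Callable[[str], None]      # e.g. a function that posts to Notion or Slack


def run_pipeline(
    task: str,
    text: str,
    processors: dict[str, Processor],
    sinks: list[Sink],
) -> str:
    """Route `text` to the processor registered for `task`, then fan out to sinks."""
    if task not in processors:
        raise ValueError(f"Unknown task: {task}")
    result = processors[task](text)  # router + processor layers
    for sink in sinks:               # sink layer
        sink(result)
    return result


if __name__ == "__main__":
    outbox: list[str] = []
    # Stand-in processor; a real one would call an LLM.
    processors = {"summary": lambda text: text[:40]}
    run_pipeline("summary", "Weekly sync: payment bug root cause found.", processors, [outbox.append])
    print(outbox)
```

Keeping the layers as injected callables makes it possible to test routing with stand-in functions and swap in the real LLM calls later.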

Environment settings

mkdir pm-agent && cd pm-agent
python -m venv .venv && source .venv/bin/activate
pip install ollama anthropic python-dotenv notion-client slack-sdk

# .env
ANTHROPIC_API_KEY=sk-ant-...
NOTION_TOKEN=secret_...
NOTION_DATABASE_ID=...
SLACK_BOT_TOKEN=xoxb-...
SLACK_CHANNEL_ID=C0...
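
Because every sink reads these variables from the environment, it can help to fail early when one is missing. The sketch below checks the five keys listed in the `.env` file above; the function name `missing_settings` is an assumption for illustration.

```python
import os
from typing import Mapping, Optional

# Keys taken from the .env example above
REQUIRED_KEYS = [
    "ANTHROPIC_API_KEY",
    "NOTION_TOKEN",
    "NOTION_DATABASE_ID",
    "SLACK_BOT_TOKEN",
    "SLACK_CHANNEL_ID",
]


def missing_settings(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required keys that are absent or empty in `env`."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    missing = missing_settings()
    if missing:
        raise SystemExit(f"Missing settings: {', '.join(missing)}")
    print("All settings present.")
```

Run it after `load_dotenv()` so that values defined only in `.env` are visible.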

Meeting minutes summary agent

When meeting minutes are entered, decisions, action items, and risks are structured and extracted.

# agents/meeting_summarizer.py
import ollama

PROMPT_TEMPLATE = """다음 회의록을 PM 관점에서 분석해줘.
아래 형식으로 정확히 출력해:
## 결정 사항
- (결정된 내용을 bullet로)
## 액션 아이템
| 항목 | 담당자 | 기한 |
|---|---|---|
| ... | ... | ... |
## 리스크 / 미결 사항
- (리스크나 추가 논의 필요 항목)
회의록:
{transcript}"""


def summarize_meeting(transcript: str) -> str:
    response = ollama.chat(
        model='qwen2.5:14b',
        messages=[{
            'role': 'user',
            'content': PROMPT_TEMPLATE.format(transcript=transcript)
        }]
    )
    return response['message']['content']


if __name__ == '__main__':
    sample = """
    참석: 김개발, 박디자인, 이PM
    일시: 2026-05-20
    이PM: 이번 스프린트 결제 모듈 오류 원인 파악됐나요?
    김개발: 네, PG사 웹훅 응답 타임아웃이 원인이었습니다. 재시도 로직 추가하면 됩니다.
    이PM: 기한은요?
    김개발: 수요일까지 가능합니다.
    박디자인: 에러 화면 UI도 같이 수정할게요. 목요일 드리겠습니다.
    이PM: 좋습니다. 이번 주 금요일 핫픽스 배포 목표로 진행하죠.
    """
    print(summarize_meeting(sample))
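
The architecture diagram includes a post-processing step (parser.py) that is not shown in the article. One possible version, sketched here as an assumption, pulls the action item table out of the summary so that each row can be sent to a tracker. It relies on the model following the section heading and three-column table requested in the prompt; `parse_action_items` is a hypothetical name.

```python
def parse_action_items(markdown: str) -> list[dict[str, str]]:
    """Extract rows of the action item table from the summarizer's output."""
    items = []
    in_section = False
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Only rows under the action item heading from the prompt are parsed
            in_section = "액션 아이템" in stripped
            continue
        if not in_section or not stripped.startswith("|"):
            continue
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        if len(cells) != 3:
            continue
        # Skip the header row and the |---|---|---| separator row
        if cells[0] == "항목" or set("".join(cells)) <= set("-: "):
            continue
        items.append({"task": cells[0], "owner": cells[1], "due": cells[2]})
    return items
```

Local models do not always follow the requested format, so an empty result is worth treating as a signal to retry or to review the summary by hand.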

PRD draft auto-generation agent

Given user feedback or issue notes, this agent produces a PRD draft. It uses the Claude API because the input context is long and the output must be well structured.

# agents/prd_drafter.py
import anthropic

# Reads ANTHROPIC_API_KEY from the environment
client = anthropic.Anthropic()

PRD_SYSTEM_PROMPT = """당신은 경험 많은 Technical PM입니다.
입력된 문제 설명을 바탕으로 PRD(Product Requirements Document) 초안을 작성합니다.
PRD는 다음 섹션을 포함해야 합니다:
1. Problem Statement
2. Goals & Success Metrics
3. User Stories
4. Scope (In / Out of Scope)
5. Edge Cases & Constraints
마크다운 형식으로 작성하세요."""


def draft_prd(problem_description: str) -> str:
    message = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=2048,
        system=PRD_SYSTEM_PROMPT,
        messages=[{
            "role": "user",
            "content": f"다음 문제를 해결하는 기능의 PRD를 작성해줘:\n\n{problem_description}"
        }]
    )
    return message.content[0].text


if __name__ == '__main__':
    problem = """
    사용자들이 비밀번호를 자주 잊어버려서 고객센터 문의의 30%가 비밀번호 초기화 관련이다.
    소셜 로그인(Google, Kakao)을 도입하면 이 문제를 해결하고 신규 가입 전환율도 올릴 수 있다.
    """
    print(draft_prd(problem))

Issue Priority Classification Agent (MoSCoW)

Receives a list of GitHub Issues or Jira tickets and classifies them using the MoSCoW framework.

# agents/issue_prioritizer.py
import json

import ollama

MOSCOW_PROMPT = """다음 이슈 목록을 MoSCoW 프레임워크로 분류해줘.
분류 기준:
- Must Have: 서비스 운영에 필수, 없으면 출시 불가
- Should Have: 중요하지만 잠시 미룰 수 있음
- Could Have: 있으면 좋지만 우선순위 낮음
- Won't Have: 이번 스프린트에서 하지 않음
반드시 JSON 형식으로만 응답해:
{{
"must": ["이슈1", "이슈2"],
"should": ["이슈3"],
"could": ["이슈4"],
"wont": ["이슈5"]
}}
이슈 목록:
{issues}"""


def prioritize_issues(issues: list[str]) -> dict:
    issues_text = '\n'.join(f'- {issue}' for issue in issues)
    response = ollama.chat(
        model='qwen2.5:14b',
        messages=[{
            'role': 'user',
            'content': MOSCOW_PROMPT.format(issues=issues_text)
        }],
        format='json'  # ask Ollama to constrain the reply to valid JSON
    )
    return json.loads(response['message']['content'])


if __name__ == '__main__':
    issues = [
        "결제 실패 시 사용자에게 에러 메시지 표시 안됨 (버그)",
        "다크모드 지원",
        "프로필 이미지 편집 기능",
        "소셜 로그인 도입",
        "앱 아이콘 리디자인",
        "세션 만료 시 자동 로그아웃 처리 없음 (보안)",
        "푸시 알림 개인화 설정",
    ]
    result = prioritize_issues(issues)
    for category, items in result.items():
        print(f"\n[{category.upper()}]")
        for item in items:
            print(f"  - {item}")
Release notes auto-generation agent

This agent creates user-friendly release notes from the git log.

# agents/release_note_generator.py
import subprocess

import ollama


def get_git_log(n: int = 30) -> str:
    """Return the last `n` commits in one-line format."""
    result = subprocess.run(
        ['git', 'log', '--oneline', f'-{n}'],
        capture_output=True, text=True,
        check=True,  # fail loudly when not inside a git repository
    )
    return result.stdout.strip()


RELEASE_PROMPT = """다음 git commit 목록을 바탕으로 릴리스 노트를 작성해줘.
규칙:
- 개발자 용어 대신 사용자 관점의 언어로 작성
- 🚀 신기능, 🐛 버그 수정, ⚡ 성능 개선, 🔒 보안 패치로 아이콘 구분
- 내부 리팩토링이나 CI 변경은 제외
- 간결하고 명확하게
git log:
{git_log}"""


def generate_release_notes(version: str = "v1.0.0") -> str:
    git_log = get_git_log()
    response = ollama.chat(
        model='qwen2.5:14b',
        messages=[{
            'role': 'user',
            'content': RELEASE_PROMPT.format(git_log=git_log)
        }]
    )
    notes = response['message']['content']
    return f"# Release Notes {version}\n\n{notes}"


if __name__ == '__main__':
    print(generate_release_notes("v2.3.0"))
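
The prompt asks the model to leave out internal refactoring and CI changes, but a small model may not do so reliably. If the repository follows the Conventional Commits style (an assumption, since the article does not say), those commits can be filtered out before the log reaches the LLM. The names `INTERNAL_TYPES` and `user_facing_commits` are hypothetical.

```python
import re

# Commit types treated as internal-only; adjust to your team's conventions
INTERNAL_TYPES = {"refactor", "ci", "chore", "test", "build", "style"}
# Matches prefixes such as "fix:", "feat(auth):", or "refactor(core)!:"
PREFIX = re.compile(r"^(?P<type>[a-z]+)(\([^)]*\))?!?:")


def user_facing_commits(git_log: str) -> list[str]:
    """Drop internal-only commits from `git log --oneline` output."""
    kept = []
    for line in git_log.splitlines():
        if not line.strip():
            continue
        # Each line is "<short sha> <subject>"
        _sha, _, subject = line.strip().partition(" ")
        match = PREFIX.match(subject)
        if match and match.group("type") in INTERNAL_TYPES:
            continue
        kept.append(subject)
    return kept
```

Commits without a recognizable prefix are kept, so the filter only removes what it can positively identify as internal.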

Task Router: automatic selection of local LLM vs Claude

The router decides which LLM to use based on the task type, which serves as a proxy for complexity.

# router.py
from enum import Enum


class TaskType(Enum):
    SIMPLE_SUMMARY = "simple_summary"        # local LLM
    ISSUE_CLASSIFICATION = "classification"  # local LLM
    RELEASE_NOTES = "release_notes"          # local LLM
    PRD_DRAFT = "prd_draft"                  # Claude
    COMPLEX_ANALYSIS = "complex_analysis"    # Claude
    CODE_REVIEW = "code_review"              # Claude


LOCAL_TASKS = {
    TaskType.SIMPLE_SUMMARY,
    TaskType.ISSUE_CLASSIFICATION,
    TaskType.RELEASE_NOTES,
}
CLAUDE_TASKS = {
    TaskType.PRD_DRAFT,
    TaskType.COMPLEX_ANALYSIS,
    TaskType.CODE_REVIEW,
}


def route(task_type: TaskType, payload: str) -> str:
    if task_type in LOCAL_TASKS:
        return _run_local(task_type, payload)
    elif task_type in CLAUDE_TASKS:
        return _run_claude(task_type, payload)
    else:
        raise ValueError(f"Unknown task type: {task_type}")


def _run_local(task_type: TaskType, payload: str) -> str:
    import ollama
    response = ollama.chat(
        model='qwen2.5:14b',
        messages=[{'role': 'user', 'content': payload}]
    )
    return response['message']['content']


def _run_claude(task_type: TaskType, payload: str) -> str:
    import anthropic
    client = anthropic.Anthropic()
    message = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=2048,
        messages=[{"role": "user", "content": payload}]
    )
    return message.content[0].text
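
The router above maps each task type to a fixed backend. Since long-context work was named earlier as a weakness of local models, one possible refinement is to escalate an otherwise local task to Claude when the input is very long. This is a sketch only: `choose_backend` is a hypothetical helper, and the 8,000-character threshold is an arbitrary placeholder rather than a measured limit.

```python
# Task names match the TaskType values that router.py sends to the local LLM
LOCAL_TASK_NAMES = frozenset({"simple_summary", "classification", "release_notes"})
LOCAL_CHAR_BUDGET = 8000  # placeholder threshold; tune for your model and hardware


def choose_backend(task_name: str, payload: str) -> str:
    """Return "local" or "claude" based on task type and input length."""
    if task_name not in LOCAL_TASK_NAMES:
        return "claude"
    if len(payload) > LOCAL_CHAR_BUDGET:
        # Long inputs are escalated even for tasks that are normally local
        return "claude"
    return "local"
```

Character count is a rough stand-in for token count, so the threshold should leave generous headroom below the model's context window.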

Notion integration

Documents created by the agent are automatically saved in the Notion database.

# sinks/notion_sink.py
import os

from dotenv import load_dotenv
from notion_client import Client

# Load .env before reading os.environ; otherwise importing this module
# raises KeyError when the variables exist only in the .env file.
load_dotenv()

notion = Client(auth=os.environ["NOTION_TOKEN"])
DATABASE_ID = os.environ["NOTION_DATABASE_ID"]


def push_to_notion(title: str, content: str, doc_type: str = "PM Doc") -> str:
    """Save the generated document to the Notion database and return the page URL."""
    response = notion.pages.create(
        parent={"database_id": DATABASE_ID},
        # The database must already have Name, Type, and Status properties
        properties={
            "Name": {"title": [{"text": {"content": title}}]},
            "Type": {"select": {"name": doc_type}},
            "Status": {"select": {"name": "Draft"}},
        },
        children=[
            {
                "object": "block",
                "type": "paragraph",
                "paragraph": {
                    # Content beyond the first 2,000 characters is dropped
                    "rich_text": [{"type": "text", "text": {"content": content[:2000]}}]
                }
            }
        ]
    )
    return response["url"]


if __name__ == '__main__':
    url = push_to_notion(
        title="소셜 로그인 기능 PRD",
        content="## Problem Statement\n비밀번호 분실 문의가 전체 CS의 30%를 차지...",
        doc_type="PRD"
    )
    print(f"Notion 페이지 생성 완료: {url}")
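
The sink above keeps only the first 2,000 characters of `content`, so a long PRD is silently cut off. One way around this, sketched here under the assumption that the 2,000-character cap applies to each rich text item, is to split the text into several paragraph blocks and pass the list as `children`. The helper name `to_paragraph_blocks` is made up for illustration.

```python
NOTION_TEXT_LIMIT = 2000  # assumed per-item limit, matching the slice in push_to_notion


def to_paragraph_blocks(content: str, limit: int = NOTION_TEXT_LIMIT) -> list[dict]:
    """Split `content` into paragraph blocks of at most `limit` characters each."""
    blocks = []
    for start in range(0, len(content), limit):
        chunk = content[start:start + limit]
        blocks.append({
            "object": "block",
            "type": "paragraph",
            "paragraph": {
                "rich_text": [{"type": "text", "text": {"content": chunk}}]
            },
        })
    return blocks
```

In `push_to_notion`, `children=to_paragraph_blocks(content)` would replace the single hard-coded block. The Notion API also limits how many blocks one request may carry, so very long documents may need to be appended in batches.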

Slack notification integration

When work is completed, the results are delivered to the person in charge via Slack.

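# sinks/slack_sink.py
import os
from typing import Optional

from dotenv import load_dotenv
from slack_sdk import WebClient

# Load .env before reading os.environ so that importing this module
# works when the variables exist only in the .env file.
load_dotenv()

slack = WebClient(token=os.environ["SLACK_BOT_TOKEN"])
CHANNEL = os.environ["SLACK_CHANNEL_ID"]


def notify_slack(summary: str, doc_url: Optional[str] = None) -> None:
    blocks = [
        {
            "type": "section",
            "text": {"type": "mrkdwn", "text": f"🤖 *PM 에이전트 작업 완료*\n\n{summary[:500]}"}
        }
    ]
    if doc_url:
        blocks.append({
            "type": "actions",
            "elements": [{
                "type": "button",
                "text": {"type": "plain_text", "text": "Notion에서 보기"},
                "url": doc_url
            }]
        })
    # `text` is the fallback shown in notifications when blocks cannot render
    slack.chat_postMessage(channel=CHANNEL, text=summary[:500], blocks=blocks)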

Tips for using Claude Code

Claude Code can be used as an agent framework rather than a simple code editor.

Batch operations with CLI

# Generate release notes from the git log
claude --print "다음 git log를 기반으로 릴리스 노트를 작성해줘: $(git log --oneline -20)"
# Read a file and draft a PRD from it
claude --print "$(cat meeting_notes.txt)를 바탕으로 PRD 초안을 작성해줘"

Context persistence with CLAUDE.md

If you need to maintain context between sessions, you can save the project background in CLAUDE.md. You can start working right away without repeating explanations every time.

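# CLAUDE.md example
## Product overview
- Service name: [service name]
- Target users: office workers in their 20s and 30s
- Core value: work efficiency
## PM glossary
- Sprint: two-week development cycle
- PRD: Product Requirements Document
- NSM: North Star Metric, monthly active users
## Current-quarter OKRs
- O: Improve payment conversion rate
- KR1: Payment completion rate 72% → 80%
- KR2: Payment page bounce rate 28% → 18%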

Call Claude Code from pipeline

# agents/claude_code_agent.py
import subprocess


def run_claude_code(prompt: str) -> str:
    """Call the Claude Code CLI as a subprocess."""
    result = subprocess.run(
        ['claude', '--print', prompt],
        capture_output=True,
        text=True,
        cwd='/path/to/project',  # project root that contains CLAUDE.md
        check=True,  # raise CalledProcessError instead of returning empty output
    )
    return result.stdout.strip()

When calling Claude Code as a subprocess, you must use the `--print` flag to run it in non-interactive mode and specify `cwd` as the project root where CLAUDE.md is located to ensure that the context is loaded correctly.
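
In a pipeline, a CLI call that hangs or exits with an error should not pass as an empty string. The sketch below wraps the same `--print` invocation with a timeout and a return-code check. The base command is a parameter so the wrapper can be exercised without the CLI installed; `run_cli` and the 300-second default are assumptions for illustration.

```python
import subprocess
import sys
from typing import Optional


def run_cli(
    base_cmd: list[str],
    prompt: str,
    cwd: Optional[str] = None,
    timeout: float = 300.0,
) -> str:
    """Run `<base_cmd> --print <prompt>` and return stdout, raising on failure."""
    result = subprocess.run(
        [*base_cmd, "--print", prompt],
        capture_output=True,
        text=True,
        cwd=cwd,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"{base_cmd[0]} exited with {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout.strip()


if __name__ == "__main__":
    # Stand-in command that echoes its last argument, used instead of ["claude"]
    echo = [sys.executable, "-c", "import sys; print(sys.argv[-1])"]
    print(run_cli(echo, "hello"))
```

For the real agent, `base_cmd` would be `["claude"]` and `cwd` the project root that contains CLAUDE.md.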

Example of full pipeline execution

# main.py
import json

from dotenv import load_dotenv

# Load .env first: the sink modules read os.environ at import time.
load_dotenv()

from agents.meeting_summarizer import summarize_meeting
from agents.prd_drafter import draft_prd
from agents.issue_prioritizer import prioritize_issues
from agents.release_note_generator import generate_release_notes
from sinks.notion_sink import push_to_notion
from sinks.slack_sink import notify_slack


def run_meeting_pipeline(transcript: str):
    print("📝 회의록 요약 중...")
    summary = summarize_meeting(transcript)
    print("📤 Notion에 저장 중...")
    url = push_to_notion(title="회의록 요약", content=summary, doc_type="Meeting Note")
    print("💬 Slack 알림 전송...")
    notify_slack(summary=summary[:300], doc_url=url)
    print(f"✅ 완료: {url}")
    return url


def run_sprint_pipeline(issues: list[str], git_log_count: int = 20):
    # Note: git_log_count is not passed through yet;
    # generate_release_notes() uses its own default commit count.
    print("🎯 이슈 우선순위 분류 중...")
    priorities = prioritize_issues(issues)
    print("📋 릴리스 노트 생성 중...")
    release_notes = generate_release_notes()
    # Render the dict as readable JSON instead of a Python repr
    priorities_text = json.dumps(priorities, ensure_ascii=False, indent=2)
    content = f"## 이슈 우선순위\n{priorities_text}\n\n## 릴리스 노트\n{release_notes}"
    url = push_to_notion(title="스프린트 리뷰", content=content, doc_type="Sprint Review")
    notify_slack(summary="스프린트 리뷰 문서가 생성되었습니다.", doc_url=url)
    return url


if __name__ == '__main__':
    # Usage example
    with open('today_meeting.txt', encoding='utf-8') as f:
        transcript = f.read()
    run_meeting_pipeline(transcript)

Cost comparison

The following estimate assumes 200 PM document automation tasks per month.

Method	Monthly cost	Remarks
GPT-4o API only	$40-80	Based on input/output tokens
Claude API only	$30-60	Using claude-opus
Local LLM only	$0	Excluding electricity
Local + Claude mix	$5-15	Claude only for complex tasks

Handling the 80% of tasks that are simple and repetitive with a local LLM can reduce API costs by 70-80% while keeping Claude's quality for complex tasks.
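
The arithmetic behind the mixed row can be checked with a few lines. This is a simplified sketch that assumes every task costs the same on the Claude API and that local tasks cost nothing per call; the function name `mixed_cost` is made up for illustration.

```python
def mixed_cost(claude_only_monthly: float, local_share: float) -> float:
    """Monthly API cost when `local_share` (0 to 1) of tasks move to a local model."""
    if not 0.0 <= local_share <= 1.0:
        raise ValueError("local_share must be between 0 and 1")
    return claude_only_monthly * (1.0 - local_share)


if __name__ == "__main__":
    # Claude-only range from the table above, with 80% of tasks handled locally
    for claude_only in (30.0, 60.0):
        cost = mixed_cost(claude_only, 0.8)
        print(f"${claude_only:.0f} Claude-only -> ${cost:.2f} mixed")
```

Under these assumptions the mixed cost comes to roughly $6 to $12 per month, which sits inside the $5-15 range in the table. In practice the tasks left on Claude tend to be the longer ones, which is consistent with the savings being quoted as 70-80% rather than a flat 80%.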

Conclusion

The core of the PM agent design is a division of labor: the local LLM handles daily repetitive tasks, and Claude Code handles tasks that require complex judgment. To start, automating meeting minute summaries alone can save about 30 minutes a day. Adding issue classification, PRD drafts, and release notes frees up more time for the work PMs should actually focus on, such as interviewing users, setting strategy, and coordinating teams.