Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2

A Qwen3.5-27B model distilled on the reasoning patterns of Claude 4.6 Opus. v2 substantially improves efficiency.

v2 update highlights

Metric	Result
HumanEval accuracy	96.91% pass@1 (same as the base model)
CoT length	~24% shorter
Correct answers per token	+31.6%
Trade-offs	HumanEval+ -1.24%, MMLU-Pro -7.2%
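
The +31.6% figure is consistent with simple arithmetic if accuracy stays fixed while the reasoning trace shrinks by about 24%. The note does not say how the number was computed, so the sketch below is an assumption about its derivation. The function name `per_token_gain` is hypothetical, and token counts are normalized to 1.0 rather than measured.

```python
# Assumption: "correct answers per token" = accuracy / average CoT tokens.
# Token counts are normalized (base = 1.0); they are not measured values.

def per_token_gain(acc_base: float, acc_new: float, length_reduction: float) -> float:
    """Relative change in accuracy-per-token when the CoT shrinks by `length_reduction`."""
    base_ratio = acc_base / 1.0
    new_ratio = acc_new / (1.0 - length_reduction)
    return new_ratio / base_ratio - 1.0

# Same HumanEval accuracy before and after, CoT about 24% shorter.
gain = per_token_gain(acc_base=0.9691, acc_new=0.9691, length_reduction=0.24)
print(f"{gain:+.1%}")  # +31.6%
```

Under this reading the gain comes entirely from the shorter trace, since 1 / 0.76 is about 1.316.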

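Design philosophy

The goal of v2 is not to make the model "think more" but to make it "think more economically":

Fewer unnecessarily long internal reasoning chains
No over-analysis on easy problems
Better quality relative to reasoning cost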

Example of the learned reasoning scaffold

Let me analyze this request carefully:
1. Identify the core objective of the problem.
2. Break the task into clearly defined subcomponents.
3. Evaluate constraints and edge cases.
4. Formulate a step-by-step solution plan.
5. Execute the reasoning sequentially and verify consistency.

Training pipeline

Base Model (Qwen3.5-27B)
        │
        ▼
Qwen3.5-27B fine-tuned with Unsloth
        │
        ▼
Supervised Fine-Tuning (SFT) + LoRA
(Response-Only Training masked on "<|im_start|>assistant\n<think/>")
        │
        ▼
Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2
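
The response-only step in the diagram can be illustrated with a small label-masking sketch: tokens up to and including the assistant marker are excluded from the loss, so only the response is trained on. This is not Unsloth's implementation. The function `mask_prompt` and the token ids are hypothetical, and the sketch is simplified to train on the final response only.

```python
# Sketch of response-only label masking. IGNORE_INDEX = -100 is the default
# ignore_index of PyTorch's CrossEntropyLoss; the token ids below are made up.
from typing import List

IGNORE_INDEX = -100

def mask_prompt(input_ids: List[int], marker_ids: List[int]) -> List[int]:
    """Return labels where everything up to and including the last
    occurrence of `marker_ids` is ignored by the loss."""
    n, m = len(input_ids), len(marker_ids)
    # Search from the end so only the final response is kept (a simplification).
    for start in range(n - m, -1, -1):
        if input_ids[start:start + m] == marker_ids:
            end = start + m
            return [IGNORE_INDEX] * end + input_ids[end:]
    # Marker not found: ignore the whole example rather than train on the prompt.
    return [IGNORE_INDEX] * n

# Hypothetical ids: 1-3 are the prompt, [90, 91] stands in for the tokenized
# marker shown in the diagram, and 7-9 are the response.
labels = mask_prompt([1, 2, 3, 90, 91, 7, 8, 9], marker_ids=[90, 91])
print(labels)  # [-100, -100, -100, -100, -100, 7, 8, 9]
```

In a real run the marker ids would come from tokenizing the marker string with the model's own tokenizer.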

Datasets used

Dataset	Purpose
nohurry/Opus-4.6-Reasoning-3000x-filtered	Claude 4.6 Opus reasoning traces
Roman1111111/claude-opus-4.6-10000x	Large-scale general reasoning distillation
TeichAI/claude-4.5-opus-high-reasoning-250x	High-intensity structured reasoning
Jackrong/Qwen3.5-reasoning-700x	Strengthening step-by-step problem solving

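Limitations and intended use

Hallucination risk: it is still an autoregressive LLM
Recommended use: offline analysis, coding, math, and logic-heavy prompting
Usage restrictions: for learning and demonstration purposes only; academic research and technical exploration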

GGUF version

The linked repository is the GGUF-quantized version, suited to running locally.

qwen-models
claude-opus
model-distillation
2026-03-29-minimax-glm-kimi-coding-comparison: comparison of three recent coding models, March 2026
Source: https://huggingface.co/Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-v2-GGUF
