LLM Notes
3. [Infra] Efficiency › Train
Training Frameworks
deepspeedai/DeepSpeed
unslothai/unsloth (finetuning framework)
RL Training Frameworks
[54.7k] hiyouga/LLaMA-Factory
[14.7k] huggingface/trl
[11.3k] volcengine/verl (ByteDance)
[8.8k] modelscope/ms-swift
[7.4k] OpenRLHF/OpenRLHF
[1.5k] alibaba/ROLL
Distributed Training
[2024.10] Liger Kernel: Efficient Triton Kernels for LLM Training (linkedin/Liger-Kernel)
[2023.04] PyTorch FSDP: Experiences on Scaling Fully Sharded Data Parallel
[2019.10] ZeRO: Memory Optimizations Toward Training Trillion Parameter Models
[LLM] GPU Memory Formulas and Optimization for Large Models
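The memory formula the linked article covers can be sketched with the standard rule of thumb from the ZeRO paper: mixed-precision Adam keeps fp16 weights (2 bytes/param), fp16 gradients (2 bytes/param), and fp32 optimizer states (master weights plus two Adam moments, 12 bytes/param), i.e. 16 bytes per parameter of model states, which the three ZeRO stages progressively partition across GPUs. The function below is an illustrative sketch under those assumptions; it ignores activations, buffers, and fragmentation, and its name and signature are hypothetical.

```python
def train_memory_gb(n_params: float, zero_stage: int = 0, n_gpus: int = 1) -> float:
    """Approximate per-GPU memory (GB) for model states under mixed-precision Adam.

    Assumption: fp16 weights/grads + fp32 optimizer states, as in the ZeRO paper.
    Activations, buffers, and fragmentation are excluded.
    """
    weights = 2 * n_params   # fp16 parameters
    grads = 2 * n_params     # fp16 gradients
    optim = 12 * n_params    # fp32 master weights + Adam momentum + variance
    if zero_stage >= 1:      # ZeRO-1: partition optimizer states
        optim /= n_gpus
    if zero_stage >= 2:      # ZeRO-2: additionally partition gradients
        grads /= n_gpus
    if zero_stage >= 3:      # ZeRO-3: additionally partition parameters
        weights /= n_gpus
    return (weights + grads + optim) / 1024**3

# A 7B-parameter model without ZeRO needs ~104 GB of model states alone,
# which ZeRO-3 across 8 GPUs cuts to roughly 13 GB per GPU.
```

This also makes the "16 bytes/param" heuristic concrete: plain data parallelism replicates all three terms on every GPU, while each ZeRO stage shards one more of them by the data-parallel degree.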