train_nanogpt_golf.jl


Description

Tricks from openai/parameter-golf + NorMuon + FlashAttention via NNkernels.jl + byte-level UTF-8 tokenizer (works for pre-reform Cyrillic)
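A byte-level UTF-8 tokenizer needs no learned vocabulary: every string maps to a sequence of byte values, so letters dropped by the 1918 orthography reform (ѣ, і, ѳ, ѵ) are handled like any other character. The sketch below illustrates the idea only. The names `encode` and `decode` are hypothetical and not taken from train_nanogpt_golf.jl, and the actual script may differ, for example by shifting IDs or reserving special tokens.

```julia
# Minimal byte-level tokenizer sketch (hypothetical helper names,
# not the repository's actual code).

# Each UTF-8 byte becomes one token ID in 0:255, so the vocabulary size is 256.
encode(text::String) = Int.(codeunits(text))

# Inverse mapping: convert the IDs back to bytes and rebuild the string.
decode(tokens::AbstractVector{<:Integer}) = String(UInt8.(tokens))

text = "міръ"              # pre-reform spelling, 4 characters
tokens = encode(text)      # 8 tokens: each of these letters is 2 bytes in UTF-8
roundtrip = decode(tokens) # identical to `text`
```

The trade-off is sequence length: Cyrillic text costs two tokens per letter, in exchange for a fixed 256-entry embedding table and no out-of-vocabulary characters.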

Languages

  • Julia 100%