kurogane/phi3-pico-test00

	---
	license: apache-2.0
	pipeline_tag: text-generation
	language:
	- ja
	- en
	datasets:
	- hotchpotch/fineweb-2-edu-japanese
	- HuggingFaceTB/smollm-corpus
	- HuggingFaceFW/finepdfs
	- OmniAICreator/WebNovels-Ja
	---

	## Overview
	This model uses the Phi3 architecture.
	The context size is 256 tokens.

	## dataset
	The model was trained for 1 epoch on the following datasets.
	- [HuggingFaceTB/smollm-corpus](https://huggingface.co/datasets/HuggingFaceTB/smollm-corpus)
	  - cosmopedia-v2: 10,000,000 samples
	  - fineweb-edu-dedup: 10,000,000 samples
	- [hotchpotch/fineweb-2-edu-japanese](https://huggingface.co/datasets/hotchpotch/fineweb-2-edu-japanese)
	  - sample_10BT: 15,000,000 samples
	- [HuggingFaceFW/finepdfs](https://huggingface.co/datasets/HuggingFaceFW/finepdfs)
	  - jpn_Jpan: 10,000,000 samples
	  - eng_Latn: 100,000 samples
	- [OmniAICreator/WebNovels-Ja](https://huggingface.co/datasets/OmniAICreator/WebNovels-Ja)
	  - 2,560,871 samples
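As a rough sanity check on the mixture, the sample counts listed above can be totalled per language. This is only a sketch: the `en`/`ja` labels are assumptions inferred from the dataset and subset names, and sample counts are not token counts, so the result says nothing precise about the token-level language balance.

```python
# Sample counts copied from the dataset list above.
# The "en"/"ja" labels are assumptions inferred from dataset and subset names.
SAMPLES = [
    ("HuggingFaceTB/smollm-corpus", "cosmopedia-v2", 10_000_000, "en"),
    ("HuggingFaceTB/smollm-corpus", "fineweb-edu-dedup", 10_000_000, "en"),
    ("hotchpotch/fineweb-2-edu-japanese", "sample_10BT", 15_000_000, "ja"),
    ("HuggingFaceFW/finepdfs", "jpn_Jpan", 10_000_000, "ja"),
    ("HuggingFaceFW/finepdfs", "eng_Latn", 100_000, "en"),
    ("OmniAICreator/WebNovels-Ja", None, 2_560_871, "ja"),
]

# Total number of samples across all datasets.
total = sum(count for _, _, count, _ in SAMPLES)

# Number of samples per (assumed) language.
by_lang = {}
for _, _, count, lang in SAMPLES:
    by_lang[lang] = by_lang.get(lang, 0) + count

ja_share = by_lang["ja"] / total
print(f"total samples: {total:,}")
print(f"per language: {by_lang}")
print(f"Japanese share by sample count: {ja_share:.1%}")
```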

	- Batch size: 140
	- Number of steps: 2094240
	- Total training tokens: 75B tokens
	- Learning rate: 3e-4
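The reported token total is consistent with the other figures if every step processes a full batch of sequences at the full context size. The sketch below checks that arithmetic. It assumes that the batch size counts sequences, that each sequence is filled to exactly 256 tokens, and that no gradient accumulation is involved; the card states none of these.

```python
# Values taken from this model card.
BATCH_SIZE = 140      # assumed to count sequences per step
STEPS = 2_094_240     # number of training steps
CONTEXT_SIZE = 256    # assumed: every sequence is filled to the full context

# Sequences seen over the whole run, then tokens at full context length.
sequences = BATCH_SIZE * STEPS
tokens = sequences * CONTEXT_SIZE

print(f"{sequences:,} sequences")
print(f"{tokens:,} tokens (~{tokens / 1e9:.1f}B)")
```

Under those assumptions the product comes to about 75.1B tokens, which matches the 75B figure reported above.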

	## tokenizer
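	The tokenizer from [Rakuten/RakutenAI-2.0-mini-instruct](https://huggingface.co/Rakuten/RakutenAI-2.0-mini-instruct) was used.
	It was chosen because it comes from a Japanese-capable LLM and its vocab_size of 48000 seemed convenient for training.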