| --- |
| license: apache-2.0 |
| datasets: |
| - CodeTed/CGEDit_dataset |
| language: |
| - zh |
| metrics: |
| - accuracy |
| library_name: transformers |
| tags: |
| - CGED |
| - CSC |
| pipeline_tag: text2text-generation |
| --- |
| # CGEDit - Chinese Grammatical Error Diagnosis by Task-Specific Instruction Tuning |
|
|
| Try the model in the "[Chinese Grammarly](https://huggingface.co/spaces/CodeTed/Chinese-Grammarly)" Space. |
|
|
| This model was obtained by fine-tuning `ClueAI/PromptCLUE-base-v1-5` on the `CodeTed/CGEDit_dataset` dataset. |
|
| ## Model Details |
| ### Model Description |
| - Language(s) (NLP): `Chinese` |
| - Finetuned from model: `ClueAI/PromptCLUE-base-v1-5` |
| ### Model Sources |
| - Repository: [https://github.com/TedYeh/Chinese_spelling_Correction](https://github.com/TedYeh/Chinese_spelling_Correction) |
|
|
| ## Usage |
| ```python |
| from transformers import AutoTokenizer, T5ForConditionalGeneration |
| |
| tokenizer = AutoTokenizer.from_pretrained("CodeTed/Chinese_Grammarly") |
| model = T5ForConditionalGeneration.from_pretrained("CodeTed/Chinese_Grammarly") |
| # Prompt = task instruction ("Correct the typos in the sentence") + the sentence to check |
| input_text = '糾正句子裡的錯字: 看完那段文張,我是反對的!' |
| input_ids = tokenizer(input_text, return_tensors="pt").input_ids |
| outputs = model.generate(input_ids, max_length=256) |
| edited_text = tokenizer.decode(outputs[0], skip_special_tokens=True) |
| print(edited_text) |
| ``` |
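
The model returns a corrected sentence rather than a list of errors, so a small post-processing step can help locate what changed. The following is a minimal sketch using Python's standard `difflib`; the helper name `diff_edits` is hypothetical and not part of this repository, and the `corrected` string is an assumed output used only for illustration (the real model output may differ). Pass the sentence without the instruction prefix as `source`.

```python
from difflib import SequenceMatcher


def diff_edits(source: str, corrected: str) -> list[tuple[str, str, str]]:
    """Return (operation, source_span, corrected_span) for every changed span."""
    matcher = SequenceMatcher(None, source, corrected, autojunk=False)
    edits = []
    for op, i1, i2, j1, j2 in matcher.get_opcodes():
        if op != "equal":  # keep only replace / delete / insert spans
            edits.append((op, source[i1:i2], corrected[j1:j2]))
    return edits


source = "看完那段文張,我是反對的!"
# Assumed model output, for illustration only; the real output may differ.
corrected = "看完那段文章,我是反對的!"
print(diff_edits(source, corrected))
```

Character-level diffing suits Chinese text, where a spelling error is typically a single wrong character; for longer rewrites the spans may be coarser than the individual errors.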