# Islamic-FinClimateBERT: Fine-Tuned ClimateBERT for Islamic Finance Climate Discourse
A domain-adapted binary classifier fine-tuned on climate-related vs. non-climate sentences from Islamic finance corpora. This model is based on ClimateBERT and is specialized for detecting climate relevance in Islamic financial narratives.
## Model Summary

- Base model: ClimateBERT (distilled RoBERTa architecture)
- Task: Binary sentence classification
- Domain: Islamic finance and climate discourse
- Labels: 0 = Not Climate-Relevant, 1 = Climate-Relevant
- Language: English (Islamic finance-specific vocabulary)
- Training data size: 1,132 manually annotated sentences
## Training Pipeline

- Framework: Hugging Face `transformers` + `datasets`
- Tokenizer: ClimateBERT tokenizer (BPE)
- Training split: Stratified 80/20 (train/test)
- Evaluation metrics: macro F1, accuracy
- Optimizer: AdamW with weight decay
- Epochs: 4
- Batch size: 16
- Precision: FP16 enabled
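The stratified 80/20 split described above can be sketched with scikit-learn. This is illustrative only: the synthetic sentences and placeholder labels below are assumptions standing in for the actual 1,132-sentence annotated dataset, whose schema is not published here.

```python
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the 1,132 annotated sentences
# (0 = not climate-relevant, 1 = climate-relevant; placeholder labels only)
sentences = [f"sentence {i}" for i in range(1132)]
labels = [i % 2 for i in range(1132)]

# Stratified 80/20 train/test split, preserving the label ratio in both splits
train_x, test_x, train_y, test_y = train_test_split(
    sentences, labels, test_size=0.2, stratify=labels, random_state=42
)

print(len(train_x), len(test_x))
```

Stratification matters with a dataset this small: without it, a random 20% slice could over- or under-sample the climate-relevant class and skew the reported F1.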
## Evaluation (Held-Out Test Set)
| Metric | Value |
|---|---|
| Accuracy | 0.9868 |
| F1-score | 0.9868 |
| Eval loss | 0.0553 |
## Evaluation & Domain Comparison
The Islamic-FinClimateBERT model was evaluated against the original ClimateBERT using 79,876 sentence-level samples extracted from 838 annual reports of 103 Islamic banks across 25 jurisdictions (2015–2024).
This comparative evaluation assesses how domain fine-tuning affects climate relevance detection within Islamic finance discourse.
### Evaluation Summary

These figures describe agreement between the two models, not per-model accuracy against a gold standard:

| Metric | Value | Description |
|---|---|---|
| Total sentences | 79,876 | Sentences compared 1-to-1 |
| Agreements | 70,209 | Sentences where both models agreed |
| Disagreements | 9,667 | Sentences with differing predictions |
| Agreement rate | 0.88 | Share of sentences on which the two models agreed |
### Classification Report (Fine-Tuned vs. Original)

Fine-tuned predictions are scored against the original ClimateBERT's labels as the reference:
| Label | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Climate | 0.92 | 0.83 | 0.87 | 39,558 |
| Non-Climate | 0.85 | 0.93 | 0.89 | 40,318 |
| Overall Accuracy | – | – | 0.88 | 79,876 |
| Macro Avg | 0.88 | 0.88 | 0.88 | – |
### Confusion Matrix

| | Fine-Tuned = Climate | Fine-Tuned = Non-Climate |
|---|---|---|
| Original = Climate | 32,887 | 6,671 |
| Original = Non-Climate | 2,996 | 37,322 |
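The agreement figures reported above can be recomputed directly from the four confusion-matrix cells, treating the original ClimateBERT's labels as the reference:

```python
# Cells of the confusion matrix above
tp = 32_887  # both models label Climate
fn = 6_671   # original = Climate, fine-tuned = Non-Climate
fp = 2_996   # original = Non-Climate, fine-tuned = Climate
tn = 37_322  # both models label Non-Climate

total = tp + fn + fp + tn                 # 79,876 sentences compared
accuracy = (tp + tn) / total              # agreement rate between the models
precision_climate = tp / (tp + fp)        # Climate precision of fine-tuned model
recall_climate = tp / (tp + fn)           # Climate recall of fine-tuned model
f1_climate = (2 * precision_climate * recall_climate
              / (precision_climate + recall_climate))

print(total)                        # 79876
print(round(accuracy, 2))           # 0.88
print(round(precision_climate, 2))  # 0.92
print(round(recall_climate, 2))     # 0.83
print(round(f1_climate, 2))         # 0.87
```

The lower Climate recall (0.83) versus precision (0.92) is the quantitative form of the observation below: the fine-tuned model labels fewer sentences as climate-relevant than the base model does.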
- The fine-tuned model shows strong domain adaptation, improving contextual sensitivity to Islamic finance climate narratives.
- It tends to classify fewer sentences as "climate-relevant" than the base model, reflecting a more conservative, context-aware reading of climate-related terminology in Islamic finance reporting.
## GitHub Repository
The full project repository, including training notebooks, dataset scripts, and evaluation pipelines, is available at https://github.com/bilalezafar/Islamic-FinClimateBERT.
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("bilalzafar/Islamic-FinClimateBERT")
model = AutoModelForSequenceClassification.from_pretrained("bilalzafar/Islamic-FinClimateBERT")
model.eval()

# Classify a sentence as climate-relevant or not
def clf(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    label = probs.argmax().item()
    score = probs.max().item()
    return [{"label": "Climate" if label == 1 else "Not Climate", "score": round(score, 4)}]

# Example usage
text = "The bank's green sukuk issuance aims to support renewable energy projects in the country."
print(clf(text)[0])
# Example output: {'label': 'Climate', 'score': 0.9995}
```