Islamic-FinClimateBERT: Fine-Tuned ClimateBERT for Islamic Finance Climate Discourse

A domain-adapted binary classifier fine-tuned on climate-related vs. non-climate sentences from Islamic finance corpora. This model is based on ClimateBERT and is specialized for detecting climate relevance in Islamic financial narratives.

Model Summary

  • Base model: ClimateBERT
  • Architecture: RoBERTa-based, distilled
  • Task: Binary sentence classification
  • Domain: Islamic Finance + Climate Discourse
  • Labels:
    • 0 → Not Climate-Relevant
    • 1 → Climate-Relevant
  • Language: English (Islamic finance-specific vocabulary)
  • Training Data Size: 1,132 manually annotated sentences
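The label scheme above can be expressed as a small helper (illustrative only; the authoritative `id2label` mapping ships in the model's config):

```python
# Label mapping for the binary classifier head, as documented above.
ID2LABEL = {0: "Not Climate-Relevant", 1: "Climate-Relevant"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}

def decode_label(class_id: int) -> str:
    """Map a predicted class id to its human-readable label."""
    return ID2LABEL[class_id]
```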

Training Pipeline

  • Framework: Hugging Face transformers + datasets
  • Tokenizer: ClimateBERT tokenizer (BPE)
  • Training split: Stratified 80/20 (train/test)
  • Evaluation metric: F1 (macro), accuracy
  • Optimizer: AdamW with weight decay
  • Epochs: 4
  • Batch size: 16
  • Precision: FP16 enabled
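The stratified 80/20 split can be sketched in plain Python. This is a minimal illustration, not the repository's actual script (which presumably uses `datasets` or scikit-learn), and it assumes a roughly balanced label distribution for the example:

```python
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.2, seed=42):
    """Split indices so each label keeps the same proportion in train and test."""
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for idxs in by_label.values():
        rng.shuffle(idxs)
        n_test = round(len(idxs) * test_frac)
        test_idx.extend(idxs[:n_test])
        train_idx.extend(idxs[n_test:])
    return train_idx, test_idx

# Example: 1,132 sentences, here assumed balanced for illustration
labels = [0] * 566 + [1] * 566
train_idx, test_idx = stratified_split(labels)
```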

Evaluation

| Metric | Value |
|---|---|
| Accuracy | 0.9868 |
| F1-score (macro) | 0.9868 |
| Eval loss | 0.0553 |
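The reported metrics follow the standard definitions; a minimal, dependency-free version of accuracy and macro F1 (equivalent in spirit to `sklearn.metrics`, shown here on toy labels) looks like:

```python
def binary_metrics(y_true, y_pred):
    """Accuracy and macro-averaged F1 for binary labels (0/1)."""
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    f1s = []
    for cls in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == cls and t != cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return acc, sum(f1s) / len(f1s)

# Toy example (not the evaluation data)
acc, macro_f1 = binary_metrics([1, 0, 1, 1], [1, 0, 0, 1])
```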

Evaluation & Domain Comparison

The Islamic-FinClimateBERT model was evaluated against the original ClimateBERT using 79,876 sentence-level samples extracted from 838 annual reports of 103 Islamic banks across 25 jurisdictions (2015–2024).

This comparative evaluation assesses how domain fine-tuning affects climate relevance detection within Islamic finance discourse.

Evaluation Summary

| Metric | Value | Description |
|---|---|---|
| Total sentences | 79,876 | Sentences compared 1-to-1 |
| Agreements | 70,209 | Sentences where both models agreed |
| Disagreements | 9,667 | Sentences with differing predictions |
| Overall agreement | 0.88 | Agreement rate between the two models |
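The agreement figures above are internally consistent, as a quick arithmetic check shows:

```python
# Agreement statistics from the comparison of the two models
total, agreements = 79_876, 70_209
disagreements = total - agreements
agreement_rate = agreements / total

print(disagreements)             # 9667
print(round(agreement_rate, 2))  # 0.88
```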

Classification Report (Fine-Tuned vs. Original)

| Label | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| Climate | 0.92 | 0.83 | 0.87 | 39,558 |
| Non-Climate | 0.85 | 0.93 | 0.89 | 40,318 |
| Overall accuracy | | | 0.88 | 79,876 |
| Macro avg | 0.88 | 0.88 | 0.88 | 79,876 |

Confusion Matrix

| | Fine-tuned: Climate | Fine-tuned: Non-Climate |
|---|---|---|
| Original: Climate | 32,887 | 6,671 |
| Original: Non-Climate | 2,996 | 37,322 |

  • The fine-tuned model shows strong domain adaptation, improving contextual sensitivity to Islamic finance climate narratives.
  • It tends to classify fewer sentences as “climate-relevant” compared to the base model, reflecting a more conservative and context-aware understanding of climate-related terminology in Islamic finance reporting.
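Treating the original ClimateBERT's labels as the reference, the headline figures in the classification report can be re-derived directly from the confusion-matrix cells:

```python
# Confusion-matrix cells from the table above
# (rows: original model, columns: fine-tuned model)
tp = 32_887  # both models say Climate
fn = 6_671   # original Climate, fine-tuned Non-Climate
fp = 2_996   # original Non-Climate, fine-tuned Climate
tn = 37_322  # both models say Non-Climate

precision_climate = tp / (tp + fp)          # ≈ 0.92
recall_climate = tp / (tp + fn)             # ≈ 0.83
accuracy = (tp + tn) / (tp + fn + fp + tn)  # ≈ 0.88
```

Note also that the fine-tuned model labels 35,883 sentences as climate-relevant versus 39,558 for the base model, which is the "more conservative" behaviour described above.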

GitHub Repository

The full project repository, including training notebooks, dataset scripts, and evaluation pipelines, is available at https://github.com/bilalezafar/Islamic-FinClimateBERT.


Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("bilalzafar/Islamic-FinClimateBERT")
model = AutoModelForSequenceClassification.from_pretrained("bilalzafar/Islamic-FinClimateBERT")
model.eval()

# Define classifier function
def clf(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():  # inference only; no gradients needed
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    label = probs.argmax().item()
    score = probs.max().item()
    return [{"label": "Climate" if label == 1 else "Not Climate", "score": round(score, 4)}]

# Example usage
text = "The bank’s green sukuk issuance aims to support renewable energy projects in the country."
print(clf(text)[0])

# Example output: {'label': 'Climate', 'score': 0.9995}
```

Model size: 82.3M parameters (F32, Safetensors format)