LiamKhoaLe committed · commit b9ed1c8 · 0 parents

Initial commit

Files changed (8):
  1. .gitattributes +35 -0
  2. DEPLOYMENT.md +151 -0
  3. README.md +213 -0
  4. app.py +140 -0
  5. client.html +460 -0
  6. client.py +119 -0
  7. client_requirements.txt +2 -0
  8. requirements.txt +11 -0
.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
DEPLOYMENT.md ADDED
@@ -0,0 +1,151 @@
+ # Deployment Guide
+
+ ## 🚀 Deploying to Hugging Face Spaces
+
+ ### Step 1: Create a New Space
+
+ 1. Go to [Hugging Face Spaces](https://huggingface.co/spaces)
+ 2. Click "Create new Space"
+ 3. Fill in the details:
+    - **Space name**: `whisper-api` (or your preferred name)
+    - **License**: MIT
+    - **SDK**: Docker
+    - **Hardware**: **ZeroGPU** (this is crucial!)
+    - **Visibility**: Public or Private
+
+ ### Step 2: Upload Files
+
+ Upload these files to your Space:
+
+ 1. **app.py** - Main application file
+ 2. **requirements.txt** - Python dependencies
+ 3. **README.md** - Space documentation (optional)
+
+ ### Step 3: Configure Space Settings
+
+ In your Space settings, ensure:
+ - **Hardware**: ZeroGPU is selected
+ - **Environment variables**: None required
+ - **Secrets**: None required
+
+ ### Step 4: Deploy
+
+ The Space will automatically build and deploy. This process may take 5-10 minutes.
+
+ ### Step 5: Test Your Deployment
+
+ Once deployed, test your API:
+
+ ```bash
+ # Health check
+ curl https://your-username-whisper-api.hf.space/health
+
+ # Test transcription (replace with your actual endpoint)
+ curl -X POST "https://your-username-whisper-api.hf.space/transcribe" \
+   -F "file=@test_audio.mp3"
+ ```
+
+ ## 🔧 Local Testing
+
+ ### Test the Web Client
+
+ 1. Open `client.html` in your browser
+ 2. Update the endpoint URL to your Space URL
+ 3. Upload an audio file and test transcription
+
+ ### Test the Python Client
+
+ ```bash
+ # Install dependencies
+ pip install -r client_requirements.txt
+
+ # Test health
+ python client.py --health --endpoint https://your-username-whisper-api.hf.space
+
+ # Test transcription
+ python client.py audio.mp3 --endpoint https://your-username-whisper-api.hf.space
+ ```
+
+ ### Run the Test Suite
+
+ ```bash
+ # Install additional dependencies for testing
+ pip install numpy soundfile
+
+ # Run comprehensive tests
+ python test_api.py
+ ```
+
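+ Note that `test_api.py` is not part of this commit's file list, so treat the command above as a placeholder. A minimal stand-in under that assumption, which synthesizes a one-second tone with numpy/soundfile and posts it to the API (the endpoint URL is also a placeholder), might look like:
+
+ ```python
+ # Hypothetical stand-in for the missing test_api.py.
+ import numpy as np
+ import requests
+ import soundfile as sf
+
+ ENDPOINT = "https://your-username-whisper-api.hf.space"  # placeholder
+
+ # Generate one second of a 440 Hz tone and save it as WAV
+ sr = 16000
+ tone = 0.1 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
+ sf.write("test_audio.wav", tone, sr)
+
+ # Health check, then transcription
+ print(requests.get(f"{ENDPOINT}/health", timeout=10).json())
+ with open("test_audio.wav", "rb") as f:
+     resp = requests.post(f"{ENDPOINT}/transcribe",
+                          files={"file": ("test_audio.wav", f, "audio/wav")},
+                          timeout=300)
+ print(resp.json())
+ ```
+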
+ ## 🐛 Troubleshooting
+
+ ### Common Issues
+
+ 1. **Space fails to build**
+    - Check that all required files are uploaded
+    - Verify `requirements.txt` has correct dependencies
+    - Check Space logs for error messages
+
+ 2. **Model loading errors**
+    - Ensure ZeroGPU is enabled
+    - Check that the model ID is correct
+    - Verify internet connectivity for model download
+
+ 3. **API not responding**
+    - Check Space status (should be "Running")
+    - Verify the Space URL is correct
+    - Check Space logs for runtime errors
+
+ 4. **CORS errors**
+    - The app is configured to allow all origins (`*`)
+    - If you need specific origins, modify the CORS settings in `app.py` (see the sketch after this list)
+
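+ A minimal sketch of restricting CORS to a known frontend, replacing the wide-open middleware block in `app.py`; the origin URL is a placeholder:
+
+ ```python
+ from fastapi.middleware.cors import CORSMiddleware
+
+ # Replace allow_origins=["*"] with an explicit allow-list.
+ # "https://myapp.example.com" is a placeholder for your frontend's origin.
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["https://myapp.example.com"],
+     allow_credentials=True,
+     allow_methods=["GET", "POST"],
+     allow_headers=["*"],
+ )
+ ```
+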
+ ### Monitoring Your Space
+
+ - **Logs**: Check the "Logs" tab in your Space
+ - **Metrics**: Monitor CPU/GPU usage in the "Metrics" tab
+ - **Health**: Use the `/health` endpoint to check API status
+
+ ## 📊 Performance Optimization
+
+ ### For Better Performance
+
+ 1. **Use ZeroGPU**: Essential for fast inference
+ 2. **Optimize file sizes**: Smaller files process faster
+ 3. **Batch processing**: The underlying pipeline accepts a `batch_size` parameter for multiple files (see the sketch after this list)
+ 4. **Model caching**: The model stays loaded between requests
+
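+ The HTTP API in `app.py` does not currently expose batching, so this is a server-side sketch of how the `transformers` pipeline created in `load_model()` could batch several files; the file names are placeholders:
+
+ ```python
+ # Sketch only: batch several clips through the ASR pipeline from app.py.
+ files = ["clip1.mp3", "clip2.mp3", "clip3.mp3"]  # placeholder names
+ results = pipe(files, batch_size=4)  # several inputs per forward pass
+ for path, result in zip(files, results):
+     print(path, "->", result["text"])
+ ```
+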
+ ### Resource Limits
+
+ - **ZeroGPU**: Limited GPU time per month
+ - **Memory**: ~2GB RAM usage
+ - **Storage**: Model files (~1.6GB)
+ - **Timeout**: 5-minute request timeout
+
+ ## 🔄 Updates and Maintenance
+
+ ### Updating Your Space
+
+ 1. Edit files in your Space repository
+ 2. Commit changes
+ 3. The Space will automatically rebuild
+
+ ### Monitoring Usage
+
+ - Check your ZeroGPU usage in your HF account
+ - Monitor API calls and performance
+ - Set up alerts if needed
+
+ ## 📈 Scaling Considerations
+
+ ### For Production Use
+
+ 1. **Rate limiting**: Consider implementing rate limiting
+ 2. **Authentication**: Add API keys for production use (a sketch follows this list)
+ 3. **Monitoring**: Set up proper logging and monitoring
+ 4. **Backup**: Keep local copies of your code
+
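+ A minimal sketch of API-key authentication as a FastAPI dependency; the `X-API-Key` header name and the `API_KEY` environment variable are assumptions, not part of the current app:
+
+ ```python
+ import os
+
+ from fastapi import Depends, HTTPException
+ from fastapi.security import APIKeyHeader
+
+ # Hypothetical scheme: clients send an X-API-Key header, which is compared
+ # against an API_KEY environment variable (set as a Space secret).
+ api_key_header = APIKeyHeader(name="X-API-Key")
+
+ def require_api_key(key: str = Depends(api_key_header)) -> None:
+     if key != os.environ.get("API_KEY"):
+         raise HTTPException(status_code=401, detail="Invalid API key")
+
+ # Then guard the endpoint in app.py with:
+ #   @app.post("/transcribe", dependencies=[Depends(require_api_key)])
+ ```
+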
+ ### Alternative Deployments
+
+ - **Self-hosted**: Deploy on your own infrastructure
+ - **Cloud providers**: Use AWS, GCP, or Azure
+ - **Docker**: Package the app with a Dockerfile for containerized deployment
README.md ADDED
@@ -0,0 +1,213 @@
+ ---
+ title: Whisper Large V3 Turbo API
+ emoji: 🎤
+ colorFrom: blue
+ colorTo: purple
+ sdk: docker
+ pinned: false
+ license: mit
+ short_description: Whisper Large V3 Turbo ASR API
+ ---
+
+ # Whisper API Project
+
+ ## Features
+
+ - **Hugging Face Space Deployment**: Deploy Whisper Large V3 Turbo with ZeroGPU
+ - **FastAPI Backend**: RESTful API endpoints for transcription
+ - **CORS Support**: Works with any frontend application
+ - **Web Interface**: Gradio interface for easy in-browser testing
+ - **Local Clients**: Both web and Python clients for integration
+ - **Multiple Formats**: Supports audio and video files
+
+ ## Project Structure
+
+ ```
+ WhisperAPI/
+ ├── app.py                    # Main Hugging Face Space application
+ ├── requirements.txt          # Python dependencies for HF Space
+ ├── README.md                 # Space documentation (this file)
+ ├── DEPLOYMENT.md             # Step-by-step deployment guide
+ ├── client.html               # Web-based client interface
+ ├── client.py                 # Python command-line client
+ └── client_requirements.txt   # Python client dependencies
+ ```
+
+ ## Setup Instructions
+
+ ### 1. Deploy to Hugging Face Spaces
+
+ 1. Create a new Space on [Hugging Face](https://huggingface.co/spaces)
+ 2. Choose "Docker" as the SDK
+ 3. Upload the following files to your Space:
+    - `app.py`
+    - `requirements.txt`
+    - `README.md` (for the Space)
+ 4. Set the Space to use **ZeroGPU** hardware
+ 5. The Space will automatically build and deploy
+
+ ### 2. Local Client Setup
+
+ #### Web Client
+ 1. Open `client.html` in your web browser
+ 2. Enter your Hugging Face Space URL (default: `https://binkhoale1812-whisperapi.hf.space`)
+ 3. Upload audio/video files and get transcriptions
+
+ #### Python Client
+ 1. Install dependencies:
+    ```bash
+    pip install -r client_requirements.txt
+    ```
+
+ 2. Use the command-line client:
+    ```bash
+    # Transcribe a file
+    python client.py audio.mp3
+
+    # Check API health
+    python client.py --health
+
+    # Save transcription to file
+    python client.py audio.mp3 --output transcription.txt
+
+    # Use custom endpoint
+    python client.py audio.mp3 --endpoint https://your-space.hf.space
+    ```
+
+ ## API Endpoints
+
+ ### POST /transcribe
+ Transcribe an audio/video file.
+
+ **Request:**
+ - Method: POST
+ - Content-Type: multipart/form-data
+ - Body: file upload in a form field named `file`
+
+ **Response:**
+ ```json
+ {
+   "text": "Transcribed text here...",
+   "success": true
+ }
+ ```
+
+ ### GET /health
+ Check API health status.
+
+ **Response:**
+ ```json
+ {
+   "status": "healthy",
+   "model_loaded": true
+ }
+ ```
+
+ ## Usage Examples
+
+ ### cURL Example
+ ```bash
+ curl -X POST "https://your-space.hf.space/transcribe" \
+   -F "file=@audio.mp3"
+ ```
+
+ ### Python Example
+ ```python
+ import requests
+
+ # Transcribe audio file
+ with open('audio.mp3', 'rb') as f:
+     files = {'file': ('audio.mp3', f, 'audio/mpeg')}
+     response = requests.post('https://your-space.hf.space/transcribe', files=files)
+     result = response.json()
+
+ if result['success']:
+     print(result['text'])
+ else:
+     print(f"Error: {result['error']}")
+ ```
+
+ ### JavaScript Example
+ ```javascript
+ const formData = new FormData();
+ formData.append('file', audioFile);
+
+ fetch('https://your-space.hf.space/transcribe', {
+     method: 'POST',
+     body: formData
+ })
+ .then(response => response.json())
+ .then(result => {
+     if (result.success) {
+         console.log(result.text);
+     } else {
+         console.error(result.error);
+     }
+ });
+ ```
+
+ ## Supported File Formats
+
+ ### Audio Formats
+ - MP3
+ - WAV
+ - FLAC
+ - M4A
+ - OGG
+
+ ### Video Formats
+ - MP4
+ - AVI
+ - MOV
+ - MKV
+
+ *Note: For video files, only the audio track will be processed.*
+
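+ If upload size or the request timeout is a concern, you can strip the audio track out of a video locally before uploading. A sketch using Python's `subprocess` to drive ffmpeg (assumes `ffmpeg` is installed; file names are placeholders):
+
+ ```python
+ import subprocess
+
+ # Extract a 16 kHz mono WAV track from a video file with ffmpeg.
+ # "input.mp4" and "audio.wav" are placeholder names.
+ subprocess.run(
+     ["ffmpeg", "-y", "-i", "input.mp4", "-vn", "-ac", "1", "-ar", "16000", "audio.wav"],
+     check=True,
+ )
+ ```
+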
+ ## Performance Notes
+
+ - **Model**: Whisper Large V3 Turbo (809M parameters)
+ - **Speed**: ~4x faster than standard Large V3
+ - **GPU**: Uses ZeroGPU for efficient inference
+ - **Languages**: Supports 99 languages
+ - **Accuracy**: Maintains high accuracy despite speed optimizations
+
+ ## Security & CORS
+
+ The API is configured with CORS enabled for all origins (`*`). This allows any frontend application to make requests to the API. For production use, consider restricting CORS origins to specific domains (see `DEPLOYMENT.md` for a sketch).
+
+ ## Troubleshooting
+
+ ### Common Issues
+
+ 1. **Model Loading Time**: The first request may take longer while the model loads (a readiness-poll sketch follows this list)
+ 2. **File Size Limits**: Large files may time out (5-minute limit)
+ 3. **Format Support**: Ensure your file format is supported
+ 4. **Network Issues**: Check your internet connection and API endpoint
+
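+ A small sketch that polls `/health` until the model reports loaded before sending the first file; the endpoint URL is a placeholder:
+
+ ```python
+ import time
+
+ import requests
+
+ ENDPOINT = "https://your-space.hf.space"  # placeholder
+
+ # Poll /health until model_loaded is true, giving up after ~2 minutes.
+ for _ in range(24):
+     try:
+         if requests.get(f"{ENDPOINT}/health", timeout=10).json().get("model_loaded"):
+             print("Model is ready")
+             break
+     except requests.RequestException:
+         pass  # the Space may still be waking up
+     time.sleep(5)
+ else:
+     print("Model did not become ready in time")
+ ```
+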
188
+
189
+ - `"Model not loaded"`: Wait a moment and try again
190
+ - `"File not found"`: Check file path and permissions
191
+ - `"Transcription failed"`: File may be corrupted or unsupported format
192
+ - `"Network error"`: Check internet connection and endpoint URL
193
+
194
+ ## Monitoring
195
+
196
+ Use the `/health` endpoint to monitor your API:
197
+ ```bash
198
+ curl https://your-space.hf.space/health
199
+ ```
200
+
201
+ ## Contributing
202
+
203
+ Feel free to submit issues and enhancement requests!
204
+
205
+ ## License
206
+
207
+ This project is licensed under the MIT License.
208
+
209
+ ## Acknowledgments
210
+
211
+ - OpenAI for the Whisper model
212
+ - Hugging Face for the Spaces platform
213
+ - The open-source community for various libraries used
app.py ADDED
@@ -0,0 +1,140 @@
+ import gradio as gr
+ import spaces
+ import torch
+ import tempfile
+ import os
+ from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor, pipeline
+ from fastapi import FastAPI, File, UploadFile
+ from fastapi.middleware.cors import CORSMiddleware
+ import uvicorn
+
+ # Initialize the model and processor globally
+ model_id = "openai/whisper-large-v3-turbo"
+ model = None
+ processor = None
+ pipe = None
+
+ @spaces.GPU
+ def load_model():
+     """Load the Whisper model on GPU"""
+     global model, processor, pipe
+
+     device = "cuda:0" if torch.cuda.is_available() else "cpu"
+     torch_dtype = torch.float16 if torch.cuda.is_available() else torch.float32
+
+     print(f"Loading model on device: {device}")
+
+     model = AutoModelForSpeechSeq2Seq.from_pretrained(
+         model_id,
+         torch_dtype=torch_dtype,
+         low_cpu_mem_usage=True,
+         use_safetensors=True
+     )
+     model.to(device)
+
+     processor = AutoProcessor.from_pretrained(model_id)
+
+     pipe = pipeline(
+         "automatic-speech-recognition",
+         model=model,
+         tokenizer=processor.tokenizer,
+         feature_extractor=processor.feature_extractor,
+         torch_dtype=torch_dtype,
+         device=device,
+     )
+
+     print("Model loaded successfully!")
+     return True
+
+ @spaces.GPU
+ def transcribe_audio(audio_file):
+     """Transcribe an audio file using Whisper"""
+     global pipe
+
+     if pipe is None:
+         return {"error": "Model not loaded. Please wait and try again.", "success": False}
+
+     try:
+         # Accept either a file path or a file-like object with a .name attribute
+         if isinstance(audio_file, str):
+             result = pipe(audio_file)
+         else:
+             result = pipe(audio_file.name)
+
+         return {
+             "text": result["text"],
+             "success": True
+         }
+     except Exception as e:
+         return {
+             "error": f"Transcription failed: {str(e)}",
+             "success": False
+         }
+
+ # Load model on startup
+ load_model()
+
+ # Create Gradio interface
+ def gradio_transcribe(audio_file):
+     """Gradio interface for transcription"""
+     if audio_file is None:
+         return "Please upload an audio file."
+
+     result = transcribe_audio(audio_file)
+
+     if result.get("success"):
+         return result["text"]
+     else:
+         return f"Error: {result.get('error', 'Unknown error')}"
+
+ # Create the Gradio interface
+ demo = gr.Interface(
+     fn=gradio_transcribe,
+     inputs=gr.Audio(
+         sources=["upload", "microphone"],
+         type="filepath",
+         label="Upload Audio File or Record"
+     ),
+     outputs=gr.Textbox(
+         label="Transcription",
+         lines=10,
+         placeholder="Transcribed text will appear here..."
+     ),
+     title="Whisper Large V3 Turbo - Speech Recognition",
+     description="Upload an audio file or record audio to get transcription using OpenAI's Whisper model.",
+     allow_flagging="never"
+ )
+
+ # Create FastAPI app for API endpoints
+ app = FastAPI(title="Whisper API", description="API for Whisper speech recognition")
+
+ # Add CORS middleware
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],  # Allow all origins
+     allow_credentials=True,
+     allow_methods=["*"],
+     allow_headers=["*"],
+ )
+
+ @app.post("/transcribe")
+ async def api_transcribe(file: UploadFile = File(...)):
+     """API endpoint for transcription"""
+     # Persist the upload to a temporary file so the ASR pipeline can read it by path
+     suffix = os.path.splitext(file.filename or "")[1] or ".wav"
+     with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as tmp:
+         tmp.write(await file.read())
+         tmp_path = tmp.name
+
+     try:
+         result = transcribe_audio(tmp_path)
+     finally:
+         os.unlink(tmp_path)
+
+     return result
+
+ @app.get("/health")
+ async def health_check():
+     """Health check endpoint"""
+     return {"status": "healthy", "model_loaded": pipe is not None}
+
+ # Mount Gradio app
+ app = gr.mount_gradio_app(app, demo, path="/")
+
+ if __name__ == "__main__":
+     uvicorn.run(app, host="0.0.0.0", port=7860)
client.html ADDED
@@ -0,0 +1,460 @@
+ <!DOCTYPE html>
+ <html lang="en">
+ <head>
+     <meta charset="UTF-8">
+     <meta name="viewport" content="width=device-width, initial-scale=1.0">
+     <title>Whisper API Client</title>
+     <style>
+         * {
+             margin: 0;
+             padding: 0;
+             box-sizing: border-box;
+         }
+
+         body {
+             font-family: 'Segoe UI', Tahoma, Geneva, Verdana, sans-serif;
+             background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+             min-height: 100vh;
+             display: flex;
+             align-items: center;
+             justify-content: center;
+             padding: 20px;
+         }
+
+         .container {
+             background: white;
+             border-radius: 20px;
+             box-shadow: 0 20px 40px rgba(0, 0, 0, 0.1);
+             padding: 40px;
+             max-width: 600px;
+             width: 100%;
+         }
+
+         .header {
+             text-align: center;
+             margin-bottom: 30px;
+         }
+
+         .header h1 {
+             color: #333;
+             font-size: 2.5em;
+             margin-bottom: 10px;
+         }
+
+         .header p {
+             color: #666;
+             font-size: 1.1em;
+         }
+
+         .upload-area {
+             border: 3px dashed #ddd;
+             border-radius: 15px;
+             padding: 40px;
+             text-align: center;
+             margin-bottom: 30px;
+             transition: all 0.3s ease;
+             cursor: pointer;
+         }
+
+         .upload-area:hover {
+             border-color: #667eea;
+             background-color: #f8f9ff;
+         }
+
+         .upload-area.dragover {
+             border-color: #667eea;
+             background-color: #f0f4ff;
+             transform: scale(1.02);
+         }
+
+         .upload-icon {
+             font-size: 3em;
+             color: #667eea;
+             margin-bottom: 20px;
+         }
+
+         .upload-text {
+             font-size: 1.2em;
+             color: #333;
+             margin-bottom: 10px;
+         }
+
+         .upload-subtext {
+             color: #666;
+             font-size: 0.9em;
+         }
+
+         .file-input {
+             display: none;
+         }
+
+         .file-info {
+             background: #f8f9fa;
+             border-radius: 10px;
+             padding: 20px;
+             margin-bottom: 20px;
+             display: none;
+         }
+
+         .file-info.show {
+             display: block;
+         }
+
+         .file-name {
+             font-weight: bold;
+             color: #333;
+             margin-bottom: 5px;
+         }
+
+         .file-size {
+             color: #666;
+             font-size: 0.9em;
+         }
+
+         .button-group {
+             display: flex;
+             gap: 15px;
+             margin-bottom: 30px;
+         }
+
+         .btn {
+             flex: 1;
+             padding: 15px 30px;
+             border: none;
+             border-radius: 10px;
+             font-size: 1.1em;
+             font-weight: bold;
+             cursor: pointer;
+             transition: all 0.3s ease;
+         }
+
+         .btn-primary {
+             background: linear-gradient(135deg, #667eea 0%, #764ba2 100%);
+             color: white;
+         }
+
+         .btn-primary:hover {
+             transform: translateY(-2px);
+             box-shadow: 0 10px 20px rgba(102, 126, 234, 0.3);
+         }
+
+         .btn-primary:disabled {
+             background: #ccc;
+             cursor: not-allowed;
+             transform: none;
+             box-shadow: none;
+         }
+
+         .btn-secondary {
+             background: #f8f9fa;
+             color: #333;
+             border: 2px solid #ddd;
+         }
+
+         .btn-secondary:hover {
+             background: #e9ecef;
+             border-color: #adb5bd;
+         }
+
+         .result-area {
+             background: #f8f9fa;
+             border-radius: 15px;
+             padding: 25px;
+             min-height: 150px;
+             display: none;
+         }
+
+         .result-area.show {
+             display: block;
+         }
+
+         .result-title {
+             font-weight: bold;
+             color: #333;
+             margin-bottom: 15px;
+             font-size: 1.2em;
+         }
+
+         .result-text {
+             color: #555;
+             line-height: 1.6;
+             white-space: pre-wrap;
+             word-wrap: break-word;
+         }
+
+         .loading {
+             display: none;
+             text-align: center;
+             padding: 20px;
+         }
+
+         .loading.show {
+             display: block;
+         }
+
+         .spinner {
+             border: 4px solid #f3f3f3;
+             border-top: 4px solid #667eea;
+             border-radius: 50%;
+             width: 40px;
+             height: 40px;
+             animation: spin 1s linear infinite;
+             margin: 0 auto 15px;
+         }
+
+         @keyframes spin {
+             0% { transform: rotate(0deg); }
+             100% { transform: rotate(360deg); }
+         }
+
+         .error {
+             background: #f8d7da;
+             color: #721c24;
+             border: 1px solid #f5c6cb;
+             border-radius: 10px;
+             padding: 15px;
+             margin-top: 20px;
+             display: none;
+         }
+
+         .error.show {
+             display: block;
+         }
+
+         .endpoint-config {
+             background: #f8f9fa;
+             border-radius: 10px;
+             padding: 20px;
+             margin-bottom: 20px;
+         }
+
+         .endpoint-config label {
+             display: block;
+             font-weight: bold;
+             color: #333;
+             margin-bottom: 8px;
+         }
+
+         .endpoint-config input {
+             width: 100%;
+             padding: 12px;
+             border: 2px solid #ddd;
+             border-radius: 8px;
+             font-size: 1em;
+             transition: border-color 0.3s ease;
+         }
+
+         .endpoint-config input:focus {
+             outline: none;
+             border-color: #667eea;
+         }
+     </style>
+ </head>
+ <body>
+     <div class="container">
+         <div class="header">
+             <h1>🎤 Whisper API Client</h1>
+             <p>Upload audio/video files for transcription using OpenAI's Whisper Large V3 Turbo</p>
+         </div>
+
+         <div class="endpoint-config">
+             <label for="endpoint">API Endpoint:</label>
+             <input type="text" id="endpoint" value="https://binkhoale1812-whisperapi.hf.space" placeholder="Enter your Hugging Face Space URL">
+         </div>
+
+         <div class="upload-area" id="uploadArea">
+             <div class="upload-icon">📁</div>
+             <div class="upload-text">Click to upload or drag & drop</div>
+             <div class="upload-subtext">Supports MP3, WAV, FLAC, M4A, MP4, AVI, MOV files</div>
+             <input type="file" id="fileInput" class="file-input" accept="audio/*,video/*">
+         </div>
+
+         <div class="file-info" id="fileInfo">
+             <div class="file-name" id="fileName"></div>
+             <div class="file-size" id="fileSize"></div>
+         </div>
+
+         <div class="button-group">
+             <button class="btn btn-primary" id="transcribeBtn" disabled>Transcribe</button>
+             <button class="btn btn-secondary" id="clearBtn">Clear</button>
+         </div>
+
+         <div class="loading" id="loading">
+             <div class="spinner"></div>
+             <div>Processing your audio file...</div>
+         </div>
+
+         <div class="result-area" id="resultArea">
+             <div class="result-title">Transcription Result:</div>
+             <div class="result-text" id="resultText"></div>
+         </div>
+
+         <div class="error" id="errorArea">
+             <div id="errorText"></div>
+         </div>
+     </div>
+
+     <script>
+         const uploadArea = document.getElementById('uploadArea');
+         const fileInput = document.getElementById('fileInput');
+         const fileInfo = document.getElementById('fileInfo');
+         const fileName = document.getElementById('fileName');
+         const fileSize = document.getElementById('fileSize');
+         const transcribeBtn = document.getElementById('transcribeBtn');
+         const clearBtn = document.getElementById('clearBtn');
+         const loading = document.getElementById('loading');
+         const resultArea = document.getElementById('resultArea');
+         const resultText = document.getElementById('resultText');
+         const errorArea = document.getElementById('errorArea');
+         const errorText = document.getElementById('errorText');
+         const endpointInput = document.getElementById('endpoint');
+
+         let selectedFile = null;
+
+         // Upload area click handler
+         uploadArea.addEventListener('click', () => {
+             fileInput.click();
+         });
+
+         // File input change handler
+         fileInput.addEventListener('change', (e) => {
+             const file = e.target.files[0];
+             if (file) {
+                 handleFileSelect(file);
+             }
+         });
+
+         // Drag and drop handlers
+         uploadArea.addEventListener('dragover', (e) => {
+             e.preventDefault();
+             uploadArea.classList.add('dragover');
+         });
+
+         uploadArea.addEventListener('dragleave', () => {
+             uploadArea.classList.remove('dragover');
+         });
+
+         uploadArea.addEventListener('drop', (e) => {
+             e.preventDefault();
+             uploadArea.classList.remove('dragover');
+             const file = e.dataTransfer.files[0];
+             if (file) {
+                 handleFileSelect(file);
+             }
+         });
+
+         // File selection handler
+         function handleFileSelect(file) {
+             selectedFile = file;
+             fileName.textContent = file.name;
+             fileSize.textContent = formatFileSize(file.size);
+             fileInfo.classList.add('show');
+             transcribeBtn.disabled = false;
+             hideError();
+         }
+
+         // Format file size
+         function formatFileSize(bytes) {
+             if (bytes === 0) return '0 Bytes';
+             const k = 1024;
+             const sizes = ['Bytes', 'KB', 'MB', 'GB'];
+             const i = Math.floor(Math.log(bytes) / Math.log(k));
+             return parseFloat((bytes / Math.pow(k, i)).toFixed(2)) + ' ' + sizes[i];
+         }
+
+         // Transcribe button handler
+         transcribeBtn.addEventListener('click', async () => {
+             if (!selectedFile) return;
+
+             const endpoint = endpointInput.value.trim();
+             if (!endpoint) {
+                 showError('Please enter a valid API endpoint');
+                 return;
+             }
+
+             showLoading();
+             hideError();
+             hideResult();
+
+             try {
+                 const formData = new FormData();
+                 formData.append('file', selectedFile);
+
+                 const response = await fetch(`${endpoint}/transcribe`, {
+                     method: 'POST',
+                     body: formData,
+                 });
+
+                 const result = await response.json();
+
+                 hideLoading();
+
+                 if (result.success) {
+                     showResult(result.text);
+                 } else {
+                     showError(result.error || 'Transcription failed');
+                 }
+             } catch (error) {
+                 hideLoading();
+                 showError(`Network error: ${error.message}`);
+             }
+         });
+
+         // Clear button handler
+         clearBtn.addEventListener('click', () => {
+             selectedFile = null;
+             fileInput.value = '';
+             fileInfo.classList.remove('show');
+             transcribeBtn.disabled = true;
+             hideResult();
+             hideError();
+         });
+
+         // Utility functions
+         function showLoading() {
+             loading.classList.add('show');
+             transcribeBtn.disabled = true;
+         }
+
+         function hideLoading() {
+             loading.classList.remove('show');
+             transcribeBtn.disabled = false;
+         }
+
+         function showResult(text) {
+             resultText.textContent = text;
+             resultArea.classList.add('show');
+         }
+
+         function hideResult() {
+             resultArea.classList.remove('show');
+         }
+
+         function showError(message) {
+             errorText.textContent = message;
+             errorArea.classList.add('show');
+         }
+
+         function hideError() {
+             errorArea.classList.remove('show');
+         }
+
+         // Test endpoint connectivity on page load
+         window.addEventListener('load', async () => {
+             const endpoint = endpointInput.value.trim();
+             if (endpoint) {
+                 try {
+                     const response = await fetch(`${endpoint}/health`);
+                     if (response.ok) {
+                         console.log('API endpoint is accessible');
+                     } else {
+                         console.warn('API endpoint returned non-200 status');
+                     }
+                 } catch (error) {
+                     console.warn('Could not reach API endpoint:', error.message);
+                 }
+             }
+         });
+     </script>
+ </body>
+ </html>
client.py ADDED
@@ -0,0 +1,119 @@
+ #!/usr/bin/env python3
+ """
+ Whisper API Client - Python script to interact with the Hugging Face Space API
+ """
+
+ import argparse
+ import mimetypes
+ import os
+ import sys
+
+ import requests
+
+ class WhisperAPIClient:
+     def __init__(self, endpoint="https://binkhoale1812-whisperapi.hf.space"):
+         self.endpoint = endpoint.rstrip('/')
+
+     def transcribe(self, file_path):
+         """
+         Transcribe an audio/video file using the Whisper API
+
+         Args:
+             file_path (str): Path to the audio/video file
+
+         Returns:
+             dict: Response containing transcription text or error
+         """
+         if not os.path.exists(file_path):
+             return {"error": f"File not found: {file_path}", "success": False}
+
+         # Guess the MIME type from the extension instead of assuming MP3
+         content_type = mimetypes.guess_type(file_path)[0] or "application/octet-stream"
+
+         try:
+             with open(file_path, 'rb') as f:
+                 files = {'file': (os.path.basename(file_path), f, content_type)}
+
+                 response = requests.post(
+                     f"{self.endpoint}/transcribe",
+                     files=files,
+                     timeout=300  # 5-minute timeout for large files
+                 )
+
+             if response.status_code == 200:
+                 return response.json()
+             else:
+                 return {
+                     "error": f"HTTP {response.status_code}: {response.text}",
+                     "success": False
+                 }
+
+         except requests.exceptions.RequestException as e:
+             return {"error": f"Request failed: {str(e)}", "success": False}
+         except Exception as e:
+             return {"error": f"Unexpected error: {str(e)}", "success": False}
+
+     def health_check(self):
+         """
+         Check if the API endpoint is healthy
+
+         Returns:
+             dict: Health status
+         """
+         try:
+             response = requests.get(f"{self.endpoint}/health", timeout=10)
+             if response.status_code == 200:
+                 return response.json()
+             else:
+                 return {"status": "unhealthy", "error": f"HTTP {response.status_code}"}
+         except requests.exceptions.RequestException as e:
+             return {"status": "unhealthy", "error": str(e)}
+
+ def main():
+     parser = argparse.ArgumentParser(description="Whisper API Client")
+     parser.add_argument("file", nargs="?", help="Audio/video file to transcribe")
+     parser.add_argument("--endpoint", default="https://binkhoale1812-whisperapi.hf.space",
+                         help="API endpoint URL")
+     parser.add_argument("--health", action="store_true", help="Check API health")
+     parser.add_argument("--output", "-o", help="Output file for transcription")
+
+     args = parser.parse_args()
+
+     client = WhisperAPIClient(args.endpoint)
+
+     # Health check
+     if args.health:
+         print("Checking API health...")
+         health = client.health_check()
+         print(f"Status: {health.get('status', 'unknown')}")
+         if 'error' in health:
+             print(f"Error: {health['error']}")
+         return
+
+     # Transcribe file
+     if not args.file:
+         print("Error: Please provide a file to transcribe")
+         print("Usage: python client.py <file> [options]")
+         print("       python client.py --health")
+         sys.exit(1)
+
+     print(f"Transcribing: {args.file}")
+     print("This may take a while for large files...")
+
+     result = client.transcribe(args.file)
+
+     if result.get("success"):
+         transcription = result["text"]
+         print("\n" + "=" * 50)
+         print("TRANSCRIPTION RESULT:")
+         print("=" * 50)
+         print(transcription)
+         print("=" * 50)
+
+         # Save to file if requested
+         if args.output:
+             with open(args.output, 'w', encoding='utf-8') as f:
+                 f.write(transcription)
+             print(f"\nTranscription saved to: {args.output}")
+     else:
+         print(f"Error: {result.get('error', 'Unknown error')}")
+         sys.exit(1)
+
+ if __name__ == "__main__":
+     main()
client_requirements.txt ADDED
@@ -0,0 +1,2 @@
+ requests>=2.31.0
requirements.txt ADDED
@@ -0,0 +1,11 @@
+ gradio>=4.0.0
+ transformers>=4.35.0
+ torch>=2.0.0
+ torchaudio>=2.0.0
+ accelerate>=0.20.0
+ datasets[audio]>=2.14.0
+ fastapi>=0.100.0
+ uvicorn>=0.20.0
+ python-multipart>=0.0.6
+ librosa>=0.10.0
+ soundfile>=0.12.0
+ spaces  # ZeroGPU helper imported by app.py