---
license: mit
tags:
- recommendation-system
- collaborative-filtering
- matrix-factorization
- movie-recommendations
- movielens
- machine-learning
library_name: scikit-learn
---

# DataSynthis_ML_JobTask

A movie recommendation system that applies collaborative filtering and matrix factorization to the MovieLens 100k dataset.

## Model Description

This model provides personalized movie recommendations using two well-established algorithms:

- **Collaborative Filtering (CF)**: Item-based recommendations using cosine similarity between movies
- **Matrix Factorization (SVD)**: Singular Value Decomposition for dimensionality reduction

## Dataset

- **MovieLens 100k**: 100,000 ratings from 943 users on 1,682 movies
- **User ID Range**: 1-943
- **Movie Count**: 1,682 unique movies
- **Rating Scale**: 1-5 stars

## Usage

### Python

```python
from model import predict

# Get recommendations using SVD (default)
recommendations = predict(user_id=1, n_recommendations=10, method="svd")

# Get recommendations using collaborative filtering
recommendations = predict(user_id=1, n_recommendations=10, method="cf")

print(recommendations)
```

### Parameters

- **user_id** (int): User ID between 1 and 943 (required)
- **n_recommendations** (int): Number of recommendations, between 1 and 20 (default: 10)
- **method** (str): "svd" for matrix factorization or "cf" for collaborative filtering (default: "svd")

### Output

Returns a list of dictionaries, one per recommended movie:

```json
[
  {
    "movie_id": 50,
    "title": "Star Wars (1977)",
    "predicted_rating": 4.5
  },
  {
    "movie_id": 181,
    "title": "Return of the Jedi (1983)",
    "predicted_rating": 4.3
  }
]
```

## Model Performance

- **SVD Method**: Fast predictions with good accuracy using 20 components
- **Collaborative Filtering**: More interpretable, based on item similarity
- **Cold Start Handling**: Graceful error handling for unknown users

## Technical Details

- **Framework**: Scikit-learn
- **Algorithms**: TruncatedSVD, Cosine Similarity
- **Data Processing**: Pandas for efficient matrix operations
- **Memory Efficient**: Optimized for large-scale recommendation tasks

## Installation

```bash
pip install pandas numpy scikit-learn
```

## Training

The model is pre-trained on the MovieLens 100k dataset. To retrain:

```python
from model import MovieRecommender

model = MovieRecommender()
model.load_data()   # load the MovieLens 100k ratings
model.train()       # fit the SVD and item-similarity models
model.save_model("movie_recommender.pkl")
```

## Citation

```bibtex
@misc{datasynthis_ml_jobtask,
  title={DataSynthis ML JobTask: Movie Recommendation System},
  author={tasdid25},
  year={2025},
  url={https://huggingface.co/tasdid25/DataSynthis_ML_JobTask}
}
```

## License

MIT License - see LICENSE file for details.
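
## Algorithm Sketch

The two methods named under Model Description (item-based cosine similarity and a low-rank SVD reconstruction) can be illustrated on a toy rating matrix. The following is a minimal sketch, not the code in `model.py`. The matrix values, the function names (`item_cosine_similarity`, `predict_item_cf`, `predict_svd`, `top_n_unseen`), and the similarity-weighted-average scoring rule are all assumptions made for illustration. NumPy's `np.linalg.svd` stands in for scikit-learn's `TruncatedSVD` so the example needs only NumPy.

```python
import numpy as np

# Toy user-item rating matrix (rows: users, columns: movies); 0 means "not rated".
# These values are invented for illustration, not taken from MovieLens.
ratings = np.array([
    [5, 4, 0, 1, 0],
    [4, 5, 1, 0, 0],
    [0, 1, 5, 4, 0],
    [1, 0, 4, 5, 3],
], dtype=float)


def item_cosine_similarity(matrix):
    """Cosine similarity between the columns (items) of a rating matrix."""
    norms = np.linalg.norm(matrix, axis=0)
    norms[norms == 0] = 1.0  # avoid division by zero for items nobody rated
    normalized = matrix / norms
    return normalized.T @ normalized


def predict_item_cf(matrix, user_index):
    """Score every item as a similarity-weighted average of the user's ratings."""
    sim = item_cosine_similarity(matrix)
    np.fill_diagonal(sim, 0.0)  # an item should not vote for itself
    user_ratings = matrix[user_index]
    rated = user_ratings > 0
    weights = sim[:, rated]  # similarity of each item to the items the user rated
    denom = weights.sum(axis=1)
    denom[denom == 0] = 1.0
    return (weights @ user_ratings[rated]) / denom


def predict_svd(matrix, user_index, n_components=2):
    """Score every item from a rank-k reconstruction of the rating matrix."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    k = n_components
    approx = (u[:, :k] * s[:k]) @ vt[:k]
    return approx[user_index]


def top_n_unseen(scores, user_ratings, n):
    """Indices of the n highest-scoring items the user has not rated yet."""
    candidates = np.where(user_ratings == 0)[0]
    order = np.argsort(scores[candidates])[::-1]
    return candidates[order][:n].tolist()


user = 0
cf_scores = predict_item_cf(ratings, user)
svd_scores = predict_svd(ratings, user)
print("CF  top unseen items:", top_n_unseen(cf_scores, ratings[user], 2))
print("SVD top unseen items:", top_n_unseen(svd_scores, ratings[user], 2))
```

Both functions return one score per movie, and `top_n_unseen` filters out movies the user has already rated before ranking. In this sketch, unrated entries are treated as zeros when computing similarities and the decomposition; that is a simplification, and the actual preprocessing in `model.py` may differ.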