Internship in Kyoto Univ Graduate School of Informatics - (nonconfidential)
- PyCharm is recommended for beginners. The Professional edition can be downloaded and tried free for 30 days.
- Register regular users through the web interface, and create administrators with the `createsuperuser` command. Detailed commands are provided below.
- Import movie information with `insert_movies_script.py`. (This will delete all existing information!)
- The frontend displays rankings such as most browsed, highest rated, and most favorited. These can be given less literal labels, such as "hottest movies" or "popular rankings". Each category displays 10 items.
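The ranking lists described above (10 items per category) can be sketched in plain Python. This is only an illustration: the `top_movies` helper and the field names `views` and `favorites` are assumptions, not the project's actual code.

```python
def top_movies(movies, field, limit=10):
    """Return the `limit` movies with the highest value of `field`.

    `movies` is a list of dicts; the field names are hypothetical.
    """
    return sorted(movies, key=lambda m: m[field], reverse=True)[:limit]


# Hypothetical sample data: 12 movies with made-up counters.
sample = [{"title": f"m{i}", "views": i, "favorites": 12 - i} for i in range(12)]

most_browsed = top_movies(sample, "views")        # "hottest movies"
most_favorited = top_movies(sample, "favorites")  # "popular rankings"
```

With the Django ORM, the same idea is usually written as an `order_by("-field")` query sliced to the first 10 rows.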
You will probably prefer user-based recommendation for your system; item-based recommendation is the alternative. Both approaches are explained below.
- Frontend: Bootstrap 3 CSS framework
- Backend: Django 2.2.1 + SQLite3 database (MVC framework)
- Data: asynchronous Python crawler that fetches the Douban Top 250 and saves it to local CSV files
- Main features: entering movie information, user ratings, movie tag classification, movie recommendation, movie sharing, movie favoriting, and a backend management system

The project adopts an MVC architecture. Frontend pages are implemented with Django templates, which makes template reuse easy and keeps the page organization clear.
Collaborative filtering calculates the distance between the current user and the other users, then filters candidates by that distance. If there are too few users and fewer than 15 recommendations are produced, the remainder is filled automatically from all unrated movies in descending order of views.
User-based recommendation:
- Users need to rate movies. Similarity is calculated from the movies each user has rated. If the user has not rated anything, or there are no other users, movies are returned in descending order of views.
- The Pearson correlation is used to calculate the distance between users. The N nearest users are found, and the movies they have rated that the target user has not yet seen are returned.
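The user-based steps can be sketched as follows. This is a minimal illustration, not the project's implementation. The function names, the `{user: {movie: score}}` layout, weighting candidates by similarity, and skipping neighbors with non-positive correlation are all assumptions. The fill-up to 15 items by view count follows the fallback rule described in this document.

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation over the movies both users rated (0.0 if undefined)."""
    common = sorted(set(a) & set(b))
    n = len(common)
    if n < 2:
        return 0.0
    xs = [a[m] for m in common]
    ys = [b[m] for m in common]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def recommend_user_based(user, ratings, views, n_neighbors=3, limit=15):
    """ratings: {user: {movie: score}}, views: {movie: view_count} (assumed layout)."""
    mine = ratings.get(user, {})
    # Fallback pool: movies the user has not rated, most viewed first.
    by_views = sorted((m for m in views if m not in mine),
                      key=lambda m: (-views[m], m))
    others = [u for u in ratings if u != user]
    if not mine or not others:
        return by_views[:limit]

    # Find the N nearest users by Pearson correlation.
    sims = {u: pearson(mine, ratings[u]) for u in others}
    neighbors = sorted(sims, key=sims.get, reverse=True)[:n_neighbors]

    # Collect movies the neighbors rated that the target user has not.
    scores = {}
    for u in neighbors:
        if sims[u] <= 0:  # assumption: ignore dissimilar users
            continue
        for movie, score in ratings[u].items():
            if movie not in mine:
                scores[movie] = scores.get(movie, 0.0) + sims[u] * score
    picks = sorted(scores, key=lambda m: (-scores[m], m))[:limit]

    # Fill up to `limit` with the most viewed unrated movies.
    for movie in by_views:
        if len(picks) >= limit:
            break
        if movie not in picks:
            picks.append(movie)
    return picks
```

A user with no ratings falls straight through to the view-count ordering, which matches the first bullet above.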
Item-based recommendation:
- Calculate the item similarity matrix.
- Traverse the items the current user has already rated and calculate their similarity distance to the unrated items.
- Sort by similarity distance and return the results.
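The item-based steps can be sketched in the same style. The source does not say which similarity measure is used for items, so cosine similarity is an assumption here, as are the function names and the `{user: {movie: score}}` layout.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two items' {user: score} vectors."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def item_similarity(ratings):
    """Build {item: {other_item: similarity}} from {user: {item: score}}."""
    by_item = {}
    for user, scores in ratings.items():
        for item, score in scores.items():
            by_item.setdefault(item, {})[user] = score
    return {
        i: {j: cosine(by_item[i], by_item[j]) for j in by_item if j != i}
        for i in by_item
    }


def recommend_item_based(user, ratings, limit=15):
    """Score unrated items by their similarity to the items the user rated."""
    sim = item_similarity(ratings)
    mine = ratings.get(user, {})
    scores = {}
    for rated, score in mine.items():
        for candidate, s in sim.get(rated, {}).items():
            if candidate not in mine:
                scores[candidate] = scores.get(candidate, 0.0) + s * score
    # Most similar first; ties are broken by name for a stable order.
    return sorted(scores, key=lambda m: (-scores[m], m))[:limit]
```

Rebuilding the full matrix on every call is fine for a Top 250 sized catalogue. A larger dataset would probably cache it.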
- Login/register page
- Movie classification, sorting, searching, rating, and ranking based on collaborative filtering.
- Weekly and monthly recommendations based on collaborative filtering.
- Features such as movie sharing events and user registration for activities (to be added separately)
- Forum functionality for posting messages (to be added separately)
- ALS algorithm based on Spark (to be added separately)
- MySQL adaptation
- Integration with MovieLens dataset
- Recommendation Algorithm - Collaborative Filtering - JianShu
- What's the Difference Between Collaborative Filtering and Content-Based Recommendation? - Zhihu
- Incorrect homepage navigation links
- Empty homepage
- Login/register page
- Recommendation redirects to login
- Weekly recommendations are random when the user has not rated anything
- Sorting by number of favorites
- Redesigned action and UserAction model, separating UserAction
- Views: The view count increases each time the page is refreshed
- Favorites: Many-to-many field to users; each user can favorite a movie once
- Rating: Each user can rate a movie once
- Like function for comments under movies
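The view, favorite, and rating rules in this list can be illustrated with a small plain-Python sketch. It is not the project's Django model: the `MovieActions` class and its method names are hypothetical. Rejecting a second rating (rather than overwriting the first) is one assumed reading of "each user rates once".

```python
class MovieActions:
    """In-memory stand-in for the per-movie action data (hypothetical)."""

    def __init__(self):
        self.views = 0
        self.favorited_by = set()  # plays the role of the many-to-many field
        self.ratings = {}          # {user: score}

    def record_view(self):
        """Count one view; called on every page refresh."""
        self.views += 1

    def favorite(self, user):
        """Add the user to the favorites; return False if already present."""
        if user in self.favorited_by:
            return False
        self.favorited_by.add(user)
        return True

    def rate(self, user, score):
        """Store the user's rating; return False if they already rated."""
        if user in self.ratings:
            return False
        self.ratings[user] = score
        return True
```

Using a set for favorites gives the "once per user" behaviour for free, which is the same guarantee a many-to-many relation provides at the database level.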
- Import the project into PyCharm and configure the Python interpreter (Python 3.7 or below). You can install it via Conda or another virtual environment.
- Open the terminal and run `pip install -r requirements.txt`. If pip is not found, download `get-pip.py` and run `python get-pip.py`.
- If the pip installation runs into C++ 14 dependency issues, install the C++ dependency tool. If you can't find it, ask me. If installation is slow, switch to a mirror in China.
- Once installation is successful, proceed to the running phase.
- Run the server: `python manage.py runserver`
- If there is no data, run the data migration scripts whose names start with "populate" in the project root directory.
- Create a superuser: `python manage.py createsuperuser` (password input will not be visible in the terminal)
- Access the admin panel: 127.0.0.1:8000/admin
For long-term updates, maintenance support, or any other issues, please contact me.
- `media/`: Directory for storing static files, such as images.
- `movie/`: Default app in Django, responsible for settings configuration, URL routing, deployment, etc.
- `static/`: Directory for storing CSS and JS files.
- `user/`: Main app; most of the program's code resides here. `user/migrations` contains auto-generated database migration files, `user/templates` contains frontend template files, `user/admins.py` contains admin backend code, `user/forms.py` contains frontend form code, `user/models.py` contains database ORM models, `user/serializers.py` contains RESTful files (not relevant), `user/urls` registers the routes, and `user/views` handles frontend requests and interacts with the backend database (i.e., the controller module).
- `cache_keys.py`: File for storing cache key names (ignore).
- `db.sqlite3`: Database file.
- `douban_crawler.py`: Douban crawler file.
- `manage.py`: Main program for running; start from here.
- `populate_movies_script.py`: Fills movie data into the database.
- `populate_user_rate.py`: Randomly generates user ratings.