Cinematic Insights: Uncovering IMDB Trends with SQL
An analysis of IMDB movie data using advanced SQL features such as CTEs, joins, subqueries, and window functions to uncover insights on genres, ratings, directors, actors, and production trends.
SQL’s Role in the Spotlight:
In this project, I utilized a variety of SQL concepts to extract insights and solve real-world data problems:
- Aggregation & Filtering: Used COUNT(), AVG(), SUM(), and conditional filtering with CASE WHEN to summarize movie data.
- Subqueries & Common Table Expressions (CTEs): Implemented complex analysis like top-rated movies, top actors, and top production houses using WITH clauses for readability and modularity.
- Joins: Combined multiple tables (movie, ratings, genre, names, director_mapping, role_mapping) using INNER JOIN to build relationships between datasets.
- Window Functions: Applied RANK(), DENSE_RANK(), and ROW_NUMBER() to rank movies, directors, and actors based on performance metrics.
- Data Cleaning: Checked for NULL values and handled missing data to ensure reliable insights.
- Analytical Queries: Computed running totals, moving averages, and weighted averages to understand rating patterns and trends.
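To show how several of these concepts can fit together, here is a minimal sketch rather than the project's actual queries. It runs the SQL through Python's built-in sqlite3 module so it works without a database server (window functions need SQLite 3.25 or newer). The table names come from the list above; the column names (id, name_id, category, avg_rating, total_votes) and every sample row are assumptions made up for illustration.

```python
import sqlite3

# In-memory database with a tiny, made-up subset of the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE names        (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE role_mapping (movie_id TEXT, name_id TEXT, category TEXT);
CREATE TABLE ratings      (movie_id TEXT, avg_rating REAL, total_votes INTEGER);

INSERT INTO names VALUES ('n1', 'Actor A'), ('n2', 'Actor B'), ('n3', 'Actor C');
INSERT INTO role_mapping VALUES
    ('m1', 'n1', 'actor'), ('m2', 'n1', 'actor'),
    ('m3', 'n2', 'actor'), ('m4', 'n2', 'actor'),
    ('m5', 'n3', 'actor'), ('m6', 'n3', 'actor');
INSERT INTO ratings VALUES
    ('m1', 9.0, 100), ('m2', 5.0, 900),
    ('m3', 8.0, 500), ('m4', 6.0, 500),
    ('m5', 7.0, 200), ('m6', NULL, 0);   -- m6 has no rating yet
""")

ACTOR_RANKING = """
WITH actor_stats AS (                      -- CTE: one row per actor
    SELECT n.name,
           COUNT(*)           AS movie_count,
           SUM(r.total_votes) AS total_votes,
           -- weighted average: each rating counts in proportion to its votes
           ROUND(SUM(r.avg_rating * r.total_votes) / SUM(r.total_votes), 2)
               AS weighted_avg
    FROM role_mapping AS rm
    INNER JOIN names   AS n ON n.id = rm.name_id
    INNER JOIN ratings AS r ON r.movie_id = rm.movie_id
    WHERE rm.category = 'actor'
      AND r.avg_rating IS NOT NULL         -- data cleaning: skip unrated movies
    GROUP BY n.id, n.name
)
SELECT name, movie_count, total_votes, weighted_avg,
       RANK()       OVER (ORDER BY weighted_avg DESC) AS rnk,
       DENSE_RANK() OVER (ORDER BY weighted_avg DESC) AS dense_rnk,
       ROW_NUMBER() OVER (ORDER BY weighted_avg DESC, total_votes DESC) AS row_num
FROM actor_stats
ORDER BY row_num;
"""

rows = conn.execute(ACTOR_RANKING).fetchall()
for row in rows:
    print(row)
```

In this sample, Actor A and Actor B both have a plain average rating of 7.0, but the weighted average separates them because most of Actor A's votes sit on the lower-rated movie. Actor B and Actor C tie on weighted average, which is where the three ranking functions differ: RANK() leaves a gap after the tie, DENSE_RANK() does not, and ROW_NUMBER() breaks the tie using the extra sort key.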
Project Objective:
The goal of this project was to dive deep into the IMDB movie dataset using SQL and uncover meaningful insights about movies, genres, directors, actors, and production houses. Through a series of analytical queries, I explored patterns in movie releases, ratings, and industry trends, transforming raw data into a cinematic story of numbers and logic.
Key Insights:
- Movie Trends by Year & Country
- Movie production peaked in 2019, with the USA and India leading the charts.
- March emerged as the most active month for new releases.
- Genre Analysis
- Drama dominated as the most produced genre, followed by Thriller and Comedy.
- Average movie durations varied across genres; Drama films averaged around 106 minutes.
- Thriller ranked among the top 3 genres in production volume.
- Ratings & Reviews Insights
- Movies with a median rating of 7 were the most common.
- Top 10 movies had average ratings above 9, highlighting exceptional storytelling.
- Top Performers
- James Mangold stood out as a director with multiple high-rated movies (>8 rating).
- Christian Bale led among actors, while Taapsee Pannu topped among Indian actresses.
- Vijay Sethupathi ranked as India’s best actor, based on weighted average ratings and total votes.
- Business Insights
- Marvel Studios led global engagement with the highest number of audience votes.
- Weighted average analysis helped identify not just who acted in the most movies but who consistently appeared in high-performing ones.
- Genre-Specific Trends
- Thriller movies with 25K+ votes were categorized as Superhit, Hit, One-time-watch, or Flop based on average rating using CASE WHEN.
- Genre-wise running totals and moving averages revealed trends in average duration across categories.
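The last two points can be sketched as follows. This is an illustrative example, not the queries used in the project: it runs through Python's built-in sqlite3 module (SQLite 3.25 or newer for window functions), the column names and sample rows are made up, the rating cut-offs for each category are assumed since the exact thresholds are not stated here, and the two-row moving-average window is likewise an arbitrary choice.

```python
import sqlite3

# In-memory database with made-up rows; column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE movie   (id TEXT PRIMARY KEY, title TEXT, duration INTEGER);
CREATE TABLE genre   (movie_id TEXT, genre TEXT);
CREATE TABLE ratings (movie_id TEXT, avg_rating REAL, total_votes INTEGER);

INSERT INTO movie VALUES
    ('m1', 'T1', 100), ('m2', 'T2', 120), ('m3', 'T3', 110),
    ('m4', 'T4', 90),  ('m5', 'T5', 130),
    ('m6', 'D1', 100), ('m7', 'D2', 110), ('m8', 'C1', 90);
INSERT INTO genre VALUES
    ('m1', 'Thriller'), ('m2', 'Thriller'), ('m3', 'Thriller'),
    ('m4', 'Thriller'), ('m5', 'Thriller'),
    ('m6', 'Drama'), ('m7', 'Drama'), ('m8', 'Comedy');
INSERT INTO ratings VALUES
    ('m1', 8.4, 30000), ('m2', 7.5, 26000), ('m3', 6.0, 40000),
    ('m4', 4.2, 25000), ('m5', 9.0, 1000);
""")

# Thrillers with at least 25,000 votes, bucketed by average rating.
# The cut-offs (8, 7, 5) are assumed for illustration only.
THRILLER_CATEGORIES = """
SELECT m.title, r.avg_rating,
       CASE
           WHEN r.avg_rating > 8  THEN 'Superhit'
           WHEN r.avg_rating >= 7 THEN 'Hit'
           WHEN r.avg_rating >= 5 THEN 'One-time-watch'
           ELSE 'Flop'
       END AS category
FROM movie AS m
INNER JOIN genre   AS g ON g.movie_id = m.id
INNER JOIN ratings AS r ON r.movie_id = m.id
WHERE g.genre = 'Thriller'
  AND r.total_votes >= 25000
ORDER BY r.avg_rating DESC;
"""

# Average duration per genre, then a running total and a moving average
# over the current and previous genre (alphabetical order).
GENRE_DURATION_TRENDS = """
WITH genre_avg AS (
    SELECT g.genre, AVG(m.duration) AS avg_duration
    FROM movie AS m
    INNER JOIN genre AS g ON g.movie_id = m.id
    GROUP BY g.genre
)
SELECT genre, avg_duration,
       SUM(avg_duration) OVER (
           ORDER BY genre
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total,
       AVG(avg_duration) OVER (
           ORDER BY genre
           ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
       ) AS moving_avg
FROM genre_avg
ORDER BY genre;
"""

categories = conn.execute(THRILLER_CATEGORIES).fetchall()
trends = conn.execute(GENRE_DURATION_TRENDS).fetchall()
for row in categories + trends:
    print(row)
```

The branches of a CASE expression are evaluated top to bottom and the first match wins, so each rating falls into exactly one category. In the second query, the ROWS frame is what distinguishes the running total (everything up to the current row) from the moving average (only the most recent rows).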
Conclusion
This IMDB analysis was a journey through the world of movies told through SQL. From identifying blockbuster trends to evaluating industry leaders, the project demonstrates how structured data and analytical queries can narrate cinematic stories. The experience sharpened my ability to handle complex relational data, optimize SQL queries, and extract actionable insights that mirror real-world business intelligence problems.
Connect with me on Threads and LinkedIn
Check out my resume