Final PPT Imdb

IMDB DATABASE BY: TEAM 7 Chintan Koticha(001267049) Payal Dodeja (001224158) Siddhant Chandiwal (001286480)

Transcript of Final PPT Imdb

Page 1: Final PPT Imdb

IMDB DATABASE

BY: TEAM 7

Chintan Koticha (001267049)
Payal Dodeja (001224158)
Siddhant Chandiwal (001286480)

Page 2: Final PPT Imdb

INTRODUCTION

The Internet Movie Database (IMDB) is an online database of information related to movies, TV shows, celebrities, genres, reviews, etc. The IMDB website enables registered users to rate movies, TV shows and actors on a scale of 1 to 10. It also enables users to search for movies or TV shows of different genres on a single platform.

Page 3: Final PPT Imdb

REQUIREMENTS

The prime focus of this project is to provide a generic, functional database modelled on IMDB for accessing entertainment industry information online.

For this purpose, we gathered data from the IMDB website and wrote SQL queries with the user’s perspective and expectations in mind.

Page 4: Final PPT Imdb

ER MODEL

ADDRESS(AddressId, StreetLine, ZipCode, CityID)
AWARDCATEGORY(AwardId, AwardName, AwardTypeId)
AWARDTYPE(AwardTypeId, AwardTypeName, Date, Description, Location)
BOXOFFICE(BoxOfficeID, Budget, OpeningWeekend, Grossincome, MovieId)
CELEBRITY(CelebrityId, CelebrityName, DateOfBirth, PlaceOfBirth, ShortBiograpgy, Gender)
CHANNEL(ChannelId, ChannelName)
COMPANYCREDITS(CompanyCreditId, ProductionCompany, MovieID)
DIRECTORS(DirectorID, DirectorName, ShortBiography, DateOfBirth, PlaceOfBirth)
EPISODE(EpisodeID, EpisodeNumber, EpisodeName, Description, SeasonID)
GENRE(GenreID, GenreName)
MOVIE(MovieID, MovieName, MovieShortDescription, ReleaseDate, MovieDuration, TelevisionContentRatingSystem, WatchTrailerURL, MovieRating, MovieTotalVotes, ReviewID)
POLL(PollID, PollName, PollDescription, URL, FeaturedPoll, HOtlyContestedPoll, PollType, ThumbnailImage, DirectorId, CelebrityID, TvShowID, MovieID, UserID)
USERACCOUNT(UserID, FirstName, LastName, Email, Password, CityID)
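As a rough illustration of how relations like the ones listed above translate into tables, the sketch below creates three of them. It is not the team's actual DDL: SQLite (via Python's sqlite3 module) stands in for MS SQL Server and Oracle, only some MOVIE columns are included, and every column type and constraint is an assumption.

```python
import sqlite3

# Assumed column types; the slides list column names only.
DDL = """
CREATE TABLE GENRE (
    GenreID   INTEGER PRIMARY KEY,
    GenreName TEXT NOT NULL
);
CREATE TABLE MOVIE (
    MovieID         INTEGER PRIMARY KEY,
    MovieName       TEXT NOT NULL,
    ReleaseDate     TEXT,
    MovieRating     REAL,
    MovieTotalVotes INTEGER
);
CREATE TABLE BOXOFFICE (
    BoxOfficeID    INTEGER PRIMARY KEY,
    Budget         REAL,
    OpeningWeekend REAL,
    Grossincome    REAL,
    -- MovieId is treated as a foreign key to MOVIE (an assumption).
    MovieId        INTEGER NOT NULL REFERENCES MOVIE(MovieID)
);
"""


def build_schema(conn):
    """Create the sketch tables on an open SQLite connection."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(DDL)


conn = sqlite3.connect(":memory:")
build_schema(conn)

# List the tables that now exist, in alphabetical order.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The remaining relations would follow the same pattern, with each ...ID column that names another relation declared as a foreign key.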

Imdb_Toad.txp

Page 5: Final PPT Imdb

SQL QUERIES

Microsoft Word Document

Predicting Movie Rating and its Votes

Page 6: Final PPT Imdb

SQL QUERIES

Analysis of Movie Collection per state per country.docx

Total Income Generated from Movies from each State and each Country

Page 7: Final PPT Imdb

SQL QUERIES

Create a function to find out the co.docx

Function to find most Networked Celebrity

Page 8: Final PPT Imdb

SQL QUERIES

Average income of movies on the basis of genre

Microsoft Word Document
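The embedded query document is not part of this transcript, so the following is only a guess at what an "average income per genre" query might look like. It assumes a link table MOVIEGENRE (hypothetical, not shown in the ER model), uses BOXOFFICE.Grossincome as the income figure, and runs on SQLite with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GENRE (GenreID INTEGER PRIMARY KEY, GenreName TEXT);
CREATE TABLE MOVIE (MovieID INTEGER PRIMARY KEY, MovieName TEXT);
CREATE TABLE BOXOFFICE (BoxOfficeID INTEGER PRIMARY KEY, Grossincome REAL, MovieId INTEGER);
-- Hypothetical link table: the slides do not show how movies map to genres.
CREATE TABLE MOVIEGENRE (MovieID INTEGER, GenreID INTEGER);

-- Toy rows for illustration only, not data from the project.
INSERT INTO GENRE VALUES (1, 'Drama'), (2, 'Comedy');
INSERT INTO MOVIE VALUES (1, 'Movie A'), (2, 'Movie B'), (3, 'Movie C');
INSERT INTO BOXOFFICE VALUES (1, 100.0, 1), (2, 300.0, 2), (3, 50.0, 3);
INSERT INTO MOVIEGENRE VALUES (1, 1), (2, 1), (3, 2);
""")

# Join each genre to the box-office rows of its movies and average the gross income.
AVG_INCOME_BY_GENRE = """
SELECT g.GenreName, AVG(b.Grossincome) AS AvgGrossIncome
FROM GENRE g
JOIN MOVIEGENRE mg ON mg.GenreID = g.GenreID
JOIN BOXOFFICE b   ON b.MovieId  = mg.MovieID
GROUP BY g.GenreName
ORDER BY AvgGrossIncome DESC
"""

rows = conn.execute(AVG_INCOME_BY_GENRE).fetchall()
for genre_name, avg_income in rows:
    print(genre_name, avg_income)
```

With the toy rows, Drama averages 200.0 (two movies) and Comedy averages 50.0 (one movie).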

Page 9: Final PPT Imdb

ORACLE QUERIES

Analysis of Movie Collection per state per countr1.docx

Total Income Generated from Movies from each State and each Country

Page 10: Final PPT Imdb

ORACLE QUERIES

Procedure for TVShowOnTonight.docx

TV Show on Tonight

Page 11: Final PPT Imdb

HIVE DATABASE

List of all Tables.csv

List of all Tables generated in Hive

Page 12: Final PPT Imdb

QUERY IN HIVE

Microsoft Excel Workbook

Page 13: Final PPT Imdb

TABLEAU REPRESENTATION

Page 14: Final PPT Imdb

TABLEAU REPRESENTATION

Page 15: Final PPT Imdb

TABLEAU REPRESENTATION

Page 16: Final PPT Imdb

ANALYSIS

The model estimates the gross collection and rating of upcoming movies on the basis of the popularity of the celebrity and director and of previous movie reviews.

Page 17: Final PPT Imdb

TABLEAU PRESENTATION

Page 18: Final PPT Imdb

OUR LEARNINGS

THE PROCESS / DIFFERENCES NOTED BETWEEN DATABASES

TRANSACTION CONTROL: A transaction is a group of operations or tasks that is treated as a single unit. By default, MS SQL Server executes and commits each command individually, so it is difficult or impossible to roll back changes if an error is encountered along the way. To group statements, the “BEGIN TRANSACTION” command declares the beginning of a transaction, and either a COMMIT or a ROLLBACK statement ends it. In Oracle, each new database connection starts a new transaction. As queries are executed and commands are issued, the changes remain uncommitted, and can be rolled back, until a COMMIT is issued (explicitly, or implicitly by a DDL statement).
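The idea of grouping statements into one unit can be sketched as follows. This is an illustration only: SQLite stands in for both servers, the sqlite3 connection is opened in autocommit mode to mimic the per-statement commit described above, and the helper insert_movies and its sample rows are hypothetical.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so a
# transaction exists only between the explicit BEGIN and COMMIT/ROLLBACK below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE MOVIE (MovieID INTEGER PRIMARY KEY, MovieName TEXT NOT NULL)")


def insert_movies(conn, movies):
    """Insert all rows as one unit: either every row is kept or none is."""
    conn.execute("BEGIN")
    try:
        for movie_id, name in movies:
            conn.execute("INSERT INTO MOVIE VALUES (?, ?)", (movie_id, name))
        conn.execute("COMMIT")
        return True
    except sqlite3.Error:
        # Undo every insert made since BEGIN, including the ones that succeeded.
        conn.execute("ROLLBACK")
        return False


ok = insert_movies(conn, [(1, "Movie A"), (2, "Movie B")])
# The second batch fails on the duplicate key 1, so row 3 is rolled back too.
failed = insert_movies(conn, [(3, "Movie C"), (1, "Duplicate key")])
count = conn.execute("SELECT COUNT(*) FROM MOVIE").fetchone()[0]
print(ok, failed, count)
```

Without the explicit BEGIN, row 3 would have been committed on its own before the duplicate key was hit.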

ORGANIZATION OF DATABASES: MS SQL Server organizes all objects, such as tables, views, and procedures, by database name. Each login is mapped to a user that is granted access to a specific database and its objects. In ORACLE, database objects are grouped by schemas, each of which is a subset of the database's objects, and the objects can be shared among schemas and users subject to the roles and privileges granted to each user.

REVERSE ENGINEERING OF DATABASE: Tools like TOAD are used to reverse engineer a database in ORACLE, while MSSQL has built-in functionality for the same purpose.

WORKING WITH TRIGGERS: MS SQL takes a set-based approach: a trigger fires once per statement, and the rows affected by a data modification (insert, update, delete) are exposed in the inserted and deleted tables. Oracle has BEFORE and AFTER triggers, and a trigger can be defined to execute per row or per statement. Within a per-row trigger, Oracle disallows access to other rows of the table being modified because, conceptually, the trigger fires for each row while that row is in the process of being modified.
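A per-row trigger of the kind described for Oracle can be sketched as below. SQLite is used as a stand-in because it also supports FOR EACH ROW triggers with OLD and NEW row references; the Oracle and MS SQL syntax differs. The audit table RATING_AUDIT and the trigger name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MOVIE (MovieID INTEGER PRIMARY KEY, MovieRating REAL);
-- Hypothetical audit table, not part of the schema in the slides.
CREATE TABLE RATING_AUDIT (MovieID INTEGER, OldRating REAL, NewRating REAL);

-- Row-level trigger: the body runs once for every row the UPDATE touches.
CREATE TRIGGER trg_movie_rating
AFTER UPDATE OF MovieRating ON MOVIE
FOR EACH ROW
BEGIN
    INSERT INTO RATING_AUDIT VALUES (OLD.MovieID, OLD.MovieRating, NEW.MovieRating);
END;

INSERT INTO MOVIE VALUES (1, 7.0), (2, 8.0), (3, 6.5);
""")

# A single statement that modifies three rows fires the trigger three times.
conn.execute("UPDATE MOVIE SET MovieRating = MovieRating + 0.5")

audit = conn.execute(
    "SELECT MovieID, OldRating, NewRating FROM RATING_AUDIT ORDER BY MovieID"
).fetchall()
print(audit)
```

In the set-based MS SQL model the same UPDATE would fire a trigger once, and the trigger body would read all three changed rows from the inserted and deleted tables.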

WORKING WITH BLOB: ORACLE stores image URLs as datatype BLOB and displays the value as (BLOB) when it is retrieved, whereas MSSQL uses datatype VARBINARY(MAX) and displays the retrieved data as a string of binary characters.
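The round trip of a binary value through a BLOB column can be sketched as follows. SQLite again stands in for Oracle and MS SQL, the BLOB type on POLL.ThumbnailImage is an assumption, and the four sample bytes are simply the start of a PNG file signature. The final line renders the bytes in the 0x... hexadecimal style in which MS SQL tools typically display VARBINARY values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed type: the slides name the ThumbnailImage column but not its datatype.
conn.execute("CREATE TABLE POLL (PollID INTEGER PRIMARY KEY, ThumbnailImage BLOB)")

image_bytes = bytes([0x89, 0x50, 0x4E, 0x47])  # first four bytes of a PNG signature
conn.execute("INSERT INTO POLL VALUES (?, ?)", (1, image_bytes))

# The driver returns the stored BLOB as a Python bytes object.
stored = conn.execute(
    "SELECT ThumbnailImage FROM POLL WHERE PollID = 1"
).fetchone()[0]

# Render the raw bytes as a hexadecimal literal for display.
as_hex = "0x" + stored.hex().upper()
print(as_hex)
```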

SYNTACTICAL DIFFERENCES BETWEEN THE TWO.

Page 19: Final PPT Imdb

CONCLUSION

Created and designed the IMDB database by studying the current version of imdb.com and built an ER model based on it

Translated the ER model to normalized tables

Extracted and analysed raw data from the IMDB website and populated the tables with this real-time data

After loading the data into the databases (MsSQL, Oracle, Hive), the next step was to design an analysis model, for example one that estimates the gross collection of a movie on the basis of previous movie ratings, celebrities, etc.

Analysed a few general user questions in Tableau, which provided clear data visualizations and helped in faster data interpretation

The model has many applications and is not limited to the sample above.

Page 20: Final PPT Imdb