Final PPT Imdb (1)
IMDB DATABASE
BY: TEAM 7
Chintan Koticha (001267049)
Payal Dodeja (001224158)
Siddhant Chandiwal (001286480)
INTRODUCTION
The Internet Movie Database (IMDB) is an online database of information related to movies, TV shows, celebrities, genres, reviews, etc. The IMDB website enables registered users to rate movies, TV shows and actors on a scale of 1 to 10. It also lets users search for movies or TV shows of different genres on a single platform.
REQUIREMENTS
The prime focus of this project is to provide a generic, functional IMDB-style database for accessing entertainment industry data online.
For this purpose, we gathered data from the IMDB website and wrote SQL queries with the user's perspective and expectations in mind.
ER MODEL
ADDRESS(AddressId, StreetLine, ZipCode, CityID)
AWARDCATEGORY(AwardId, AwardName, AwardTypeId)
AWARDTYPE(AwardTypeId, AwardTypeName, Date, Description, Location)
BOXOFFICE(BoxOfficeID, Budget, OpeningWeekend, Grossincome, MovieId)
CELEBRITY(CelebrityId, CelebrityName, DateOfBirth, PlaceOfBirth, ShortBiograpgy, Gender)
CHANNEL(ChannelId, ChannelName)
COMPANYCREDITS(CompanyCreditId, ProductionCompany, MovieID)
DIRECTORS(DirectorID, DirectorName, ShortBiography, DateOfBirth, PlaceOfBirth)
EPISODE(EpisodeID, EpisodeNumber, EpisodeName, Description, SeasonID)
GENRE(GenreID, GenreName)
MOVIE(MovieID, MovieName, MovieShortDescription, ReleaseDate, MovieDuration, TelevisionContentRatingSystem, WatchTrailerURL, MovieRating, MovieTotalVotes, ReviewID)
POLL(PollID, PollName, PollDescription, URL, FeaturedPoll, HOtlyContestedPoll, PollType, ThumbnailImage, DirectorId, CelebrityID, TvShowID, MovieID, UserID)
USERACCOUNT(UserID, FirstName, LastName, Email, Password, CityID)
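As a minimal sketch of how two of the relations above might translate into tables, the following Python snippet uses the standard-library sqlite3 module as a stand-in for MS SQL Server or Oracle. Only the table and column names come from the ER model; the column types, the reduced column set for MOVIE, the sample row and the profit query are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request

# Column types are assumed; the slides list names only.
conn.executescript("""
CREATE TABLE MOVIE (
    MovieID     INTEGER PRIMARY KEY,
    MovieName   TEXT NOT NULL,
    ReleaseDate TEXT,
    MovieRating REAL
);
CREATE TABLE BOXOFFICE (
    BoxOfficeID    INTEGER PRIMARY KEY,
    Budget         REAL,
    OpeningWeekend REAL,
    Grossincome    REAL,
    MovieId        INTEGER NOT NULL REFERENCES MOVIE(MovieID)
);
""")

# Made-up sample row, not data from imdb.com.
conn.execute("INSERT INTO MOVIE VALUES (1, 'Sample Movie A', '2015-01-01', 7.5)")
conn.execute("INSERT INTO BOXOFFICE VALUES (1, 100.0, 40.0, 250.0, 1)")

# Join the two relations on the MovieId foreign key.
rows = conn.execute("""
    SELECT m.MovieName, b.Grossincome - b.Budget AS Profit
    FROM MOVIE m
    JOIN BOXOFFICE b ON b.MovieId = m.MovieID
""").fetchall()
print(rows)
```

The foreign key on BOXOFFICE.MovieId reflects the one relationship that the listed columns make explicit; other relationships would need the remaining tables of the model.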
Imdb_Toad.txp
SQL QUERIES
Predicting Movie Rating and its Votes
SQL QUERIES
Analysis of Movie Collection per state per country.docx
Total Income Generated from Movies from each State and each Country
SQL QUERIES
Create a function to find out the co.docx
Function to find most Networked Celebrity
SQL QUERIES
Average income of movies on the basis of genre
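The team's actual query is in the attached document and is not reproduced in this transcript. As a hedged sketch of what an average-income-per-genre query could look like, the snippet below uses Python's sqlite3 module as a stand-in engine. GENRE and BOXOFFICE follow the ER model; the MOVIEGENRE link table is hypothetical (the slides do not show how movies map to genres), and all rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GENRE (GenreID INTEGER PRIMARY KEY, GenreName TEXT);
CREATE TABLE BOXOFFICE (BoxOfficeID INTEGER PRIMARY KEY, Grossincome REAL, MovieId INTEGER);
-- Hypothetical link table: not part of the ER model shown in the slides.
CREATE TABLE MOVIEGENRE (MovieID INTEGER, GenreID INTEGER);

-- Made-up sample data.
INSERT INTO GENRE VALUES (1, 'Drama'), (2, 'Comedy');
INSERT INTO BOXOFFICE VALUES (1, 100.0, 10), (2, 300.0, 11), (3, 50.0, 12);
INSERT INTO MOVIEGENRE VALUES (10, 1), (11, 1), (12, 2);
""")

# Average gross income per genre, highest first.
avg_by_genre = conn.execute("""
    SELECT g.GenreName, AVG(b.Grossincome) AS AvgIncome
    FROM GENRE g
    JOIN MOVIEGENRE mg ON mg.GenreID = g.GenreID
    JOIN BOXOFFICE b   ON b.MovieId = mg.MovieID
    GROUP BY g.GenreName
    ORDER BY AvgIncome DESC
""").fetchall()
print(avg_by_genre)
```

The same GROUP BY / AVG pattern is standard SQL, so it should carry over to MS SQL Server and Oracle with little change once the real genre mapping is substituted.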
ORACLE QUERIES
Analysis of Movie Collection per state per countr1.docx
Total Income Generated from Movies from each State and each Country
ORACLE QUERIES
Procedure for TVShowOnTonight.docx
TV Show on Tonight
HIVE DATABASE
List of all Tables.csv
List of all Tables generated in Hive
QUERY IN HIVE
TABLEAU REPRESENTATION
ANALYSIS
The model estimates the gross collection and rating of upcoming movies on the basis of the popularity of the celebrity and director and of previous movie reviews.
TABLEAU PRESENTATION
OUR LEARNINGS: THE PROCESS / DIFFERENCES NOTED BETWEEN DATABASES
TRANSACTION CONTROL: A transaction is a group of operations or tasks that should be treated as a single unit. By default, MS SQL Server executes and commits each command individually, so it is difficult or impossible to roll back changes if an error is encountered along the way. To group statements, the BEGIN TRANSACTION command declares the beginning of a transaction, and either a COMMIT or a ROLLBACK statement ends it. In Oracle, each new database connection is treated as a new transaction. As queries are executed and commands are issued, changes are not made permanent until an explicit COMMIT statement is given.
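The difference between autocommit and an explicit transaction can be sketched with Python's sqlite3 module, used here only as a stand-in engine; SQLite spells the statement BEGIN rather than BEGIN TRANSACTION being required, and the table and rows are made-up assumptions.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so a
# transaction starts only where BEGIN is written explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE MOVIE (MovieID INTEGER PRIMARY KEY, MovieName TEXT NOT NULL)")

# Autocommit: this insert is committed as soon as it runs.
conn.execute("INSERT INTO MOVIE VALUES (1, 'Sample Movie A')")

# Explicit transaction: both inserts are kept or discarded together.
try:
    conn.execute("BEGIN")
    conn.execute("INSERT INTO MOVIE VALUES (2, 'Sample Movie B')")
    conn.execute("INSERT INTO MOVIE VALUES (3, NULL)")  # violates NOT NULL
    conn.execute("COMMIT")
except sqlite3.IntegrityError:
    conn.execute("ROLLBACK")  # also discards 'Sample Movie B'

names = [row[0] for row in conn.execute("SELECT MovieName FROM MOVIE ORDER BY MovieID")]
print(names)
```

Only the autocommitted row survives: the failed insert causes the whole explicit transaction, including the valid second row, to be rolled back.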
ORGANIZATION OF DATABASES: MS SQL Server organizes all objects, such as tables, views, and procedures, by database name. Users are assigned to a login, which is granted access to a specific database and its objects. In Oracle, all database objects are grouped by schema, a collection of database objects, and objects can be shared among schemas and users, subject to the privileges granted.
REVERSE ENGINEERING OF DATABASE: Tools like TOAD are used to reverse engineer a database in Oracle, while MS SQL Server has built-in functionality for the same.
WORKING WITH TRIGGERS: MS SQL Server takes a set-based approach: a trigger fires once per statement, and the rows affected by a data modification (insert, update, delete) are exposed in the inserted and deleted tables. In Oracle there are BEFORE and AFTER triggers, and a trigger can be defined to execute per row or per statement. Oracle does not let a per-row trigger access other rows of the same table because, conceptually, the trigger fires for each row while the table is still being modified.
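A per-row trigger can be sketched with Python's sqlite3 module as a stand-in engine. SQLite supports only FOR EACH ROW triggers, so this illustrates the row-level style described for Oracle, not Oracle's exact syntax. The RATING_AUDIT table, the trigger name and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MOVIE (MovieID INTEGER PRIMARY KEY, MovieName TEXT, MovieRating REAL);
-- Hypothetical audit table, not part of the ER model.
CREATE TABLE RATING_AUDIT (MovieID INTEGER, OldRating REAL, NewRating REAL);

-- Row-level trigger: the body runs once for every updated row and
-- sees that row's values through OLD and NEW.
CREATE TRIGGER trg_rating_audit
AFTER UPDATE OF MovieRating ON MOVIE
FOR EACH ROW
BEGIN
    INSERT INTO RATING_AUDIT VALUES (OLD.MovieID, OLD.MovieRating, NEW.MovieRating);
END;

-- Made-up sample data.
INSERT INTO MOVIE VALUES (1, 'Sample Movie A', 6.0), (2, 'Sample Movie B', 7.0);
""")

# One statement that touches two rows fires the trigger twice.
conn.execute("UPDATE MOVIE SET MovieRating = MovieRating + 1")
audit = conn.execute("SELECT * FROM RATING_AUDIT ORDER BY MovieID").fetchall()
print(audit)
```

In MS SQL Server the same UPDATE would fire a trigger once, with both affected rows available together in the inserted and deleted tables.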
WORKING WITH BLOB: Oracle stores image URLs as datatype BLOB and shows the value as (BLOB) when it is retrieved, whereas MS SQL Server uses datatype VARBINARY(MAX) and shows the retrieved value as a string of binary characters.
SYNTACTICAL DIFFERENCES BETWEEN THE TWO.
CONCLUSION
Designed and created the IMDB database by studying the current version of imdb.com, and built an ER model from it
Translated the ER model to normalized tables
Extracted and analysed raw data from the imdb.com website and populated the tables with it
After loading the data into the databases (MS SQL Server, Oracle, Hive), the next step was to design an analysis model, such as estimating the gross collection of a movie on the basis of previous movie ratings, celebrities, etc.
Analysed a few general user questions in Tableau, whose data visualizations help with faster data interpretation
The applications of the model are many and are not limited to the samples above.