Jekyll2021-12-17T23:36:26+00:00https://igag9.github.io/Site_web_ada/feed.xmlData storyA website with blog posts and pagesAbstract2021-04-27T00:00:00+00:002021-04-27T00:00:00+00:00https://igag9.github.io/Site_web_ada/2021/04/27/dark-mode<p><strong>“She’s learning where she fits on the food chain, and I’m not sure you want her to figure that out” [Owen Grady, Jurassic World]</strong></p>
<p>Gender inequality is widespread in today’s world, and the cinema industry is no exception. Women have always been present in cinema, but in what proportion? As debates and studies have highlighted these inequalities in recent years, one question remains: what is the place of women in the cinema industry, as seen by the speakers quoted in newspapers?</p>
<p>The goal of this project is to answer this question. To do so, it will rely mainly on Quotebank, a database of quotes attributed to individuals and extracted from newspapers published between 2008 and 2020. This project will focus on quotes from 2018 and will pave the way for future studies on other years.</p>
<h2 id="research-questions">Research Questions</h2>
<ol>
<li>
<p>Who, in terms of gender, is talking about movies in the newspapers?</p>
</li>
<li>
<p>Which characteristics of movies affect their memorability over time? Does gender influence a movie’s memorability?</p>
</li>
<li>
<p>Is it possible to predict the rating of a movie by using Quotebank and determine if gender has an impact on it?</p>
</li>
</ol>
Datasets selection2019-06-30T00:00:00+00:002019-06-30T00:00:00+00:00https://igag9.github.io/Site_web_ada/2019/06/30/sample-post
<p><strong>Three databases were used to answer the research questions:</strong></p>
<!--more-->
<!-- Page Content -->
<div class="container">
<div class="row">
<div class="col-lg-6 mb-4">
<div class="card h-100">
<a href="#"><img class="card-img-top" src="https://upload.wikimedia.org/wikipedia/commons/2/27/Wikidata_barcode.svg" alt="" /></a>
<div class="card-body">
<h4 class="card-title">
<a href="#">Wikidata</a>
</h4>
<p class="card-text">The Wikidata database was used because it contains information about people, films, and more. To answer our research questions, we need the gender and age of the people involved in movies (speakers, producers, actors) and the movies' release dates. Metadata about Quotebank's speakers is already provided in the speaker attributes.parquet files. Information about the other people will be retrieved via the Wikidata API.</p>
</div>
</div>
</div>
<div class="col-lg-6 mb-4">
<div class="card h-100">
<a href="#"><img class="card-img-top" src="https://logowik.com/content/uploads/images/imdb-internet-movie-database5351.jpg" alt="" /></a>
<div class="card-body">
<h4 class="card-title">
<a href="#">IMDb</a>
</h4>
<p class="card-text">The Internet Movie Database (IMDb) is an online database that provides information about movies. Its data is available online, split into six datasets. These datasets were combined and processed into a single dataset, and the columns were filtered according to the needs of the research. The columns of interest are: title, year, genres, crew (producer, cinematographer, writer), actors, ratings.</p>
</div>
</div>
</div>
<div class="col-lg-6 mb-4">
<div class="card h-100">
<a href="#"><img class="card-img-top" src="http://www.benoitmeylan.ch/ressources/img/Quotebank.png" alt="" /></a>
<div class="card-body">
<h4 class="card-title">
<a href="#">Quotebank</a>
</h4>
<p class="card-text">As mentioned, only the 2018 data from Quotebank was retrieved. Quotes about cinema were selected using the names of films.</p>
</div>
</div>
</div>
</div>
<!-- /.row -->
</div>
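<p>The selection of cinema-related quotes by film name can be sketched as follows. This is only an illustration under stated assumptions: the sample rows, the <em>titles</em> list and the helper <em>select_film_quotes</em> are hypothetical, the column names are assumed to follow Quotebank's schema, and the word-boundary matching is one possible strategy, not necessarily the one used in the project.</p>

```python
import re
import pandas as pd

# Hypothetical sample data: the rows are invented for illustration and the
# column names ("quoteID", "quotation", "speaker") are an assumption.
quotes = pd.DataFrame({
    "quoteID": ["q1", "q2", "q3"],
    "quotation": [
        "I loved Black Panther, it was a landmark film.",
        "The budget vote is scheduled for next week.",
        "A Star Is Born moved the whole audience.",
    ],
    "speaker": ["Speaker A", "Speaker B", "Speaker C"],
})

# Hypothetical list of film titles (in the project these would come from IMDb).
titles = ["Black Panther", "A Star Is Born"]


def select_film_quotes(quotes, titles):
    """Keep the quotes that mention at least one title and record which one."""
    # Build one alternation pattern, longest titles first, with word boundaries
    # so that a title is not matched inside a longer word.
    ordered = sorted(titles, key=len, reverse=True)
    pattern = r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b"
    matched = quotes["quotation"].str.extract(
        pattern, flags=re.IGNORECASE, expand=False
    )
    return quotes.assign(title=matched).dropna(subset=["title"])


film_quotes = select_film_quotes(quotes, titles)
print(film_quotes[["quoteID", "title"]])
```

<p>Very short or common titles (a film called "It" or "Her", for example) would produce many false positives with this kind of matching, so a real pipeline would probably need an extra filter.</p>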
<!-- /.container -->
Who, in terms of gender, is talking about film in the media?2019-05-18T00:00:00+00:002019-05-18T00:00:00+00:00https://igag9.github.io/Site_web_ada/2019/05/18/color-post
<p>To answer this question, we need the gender and age of the speakers, which we retrieve from the Wikidata parquet files. These variables can then be analysed by aggregating them and looking at summary statistics and/or distributions.</p>
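<p>The aggregation step could look like the minimal sketch below. The data frame <em>film_speakers</em>, its column names and its values are invented for illustration; the real speaker metadata comes from the Wikidata parquet files.</p>

```python
import pandas as pd

# Hypothetical rows: one line per speaker quoted about a film, after joining
# the film-related quotes with the speaker metadata. Values are invented.
film_speakers = pd.DataFrame({
    "speaker": ["A", "B", "C", "D", "E"],
    "gender": ["male", "female", "male", "male", "female"],
    "age": [45, 38, 60, 29, 51],
})

# Share of each gender among the speakers.
gender_share = film_speakers["gender"].value_counts(normalize=True)

# Summary statistics of age for each gender.
age_stats = film_speakers.groupby("gender")["age"].agg(["count", "mean", "median"])

print(gender_share)
print(age_stats)
```

<p>Running the same aggregation on all Quotebank speakers, not only those talking about films, gives the baseline against which the cinema-specific shares can be compared.</p>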
<div class="row">
<div style="flex: 33.333333333333336%">
<img class="single" src="/Site_web_ada/assets/img/feature-img/q1.jpeg" alt="q1.jpeg" />
</div>
</div>
<p>More men than women talk about cinema. However, compared with all the subjects covered in Quotebank, women are better represented among the speakers who talk about cinema.</p>
Does gender impact memorability?2017-09-17T00:00:00+00:002017-09-17T00:00:00+00:00https://igag9.github.io/Site_web_ada/2017/09/17/Use-Bootstrap
<p>For this question, we assume that, for a given film, the more people talk about it, the more memorable it is.
We then hypothesize that a film made by many men will be more memorable.</p>
<p>The graphs show the proportion of men as a function of the memorability of the films.</p>
<p><strong>The first two graphs show the proportion of men among the speakers:</strong></p>
<div class="row">
<div class="column">
<img src="/Site_web_ada/assets/img/feature-img/Q2_a.jpeg" alt="Q2_a.jpeg" />
</div>
<div class="column">
<img src="/Site_web_ada/assets/img/feature-img/Q2_b.jpeg" alt="Q2_b.jpeg" />
</div>
</div>
<p>Both graphs show that the hypothesis is rejected.</p>
<p><strong>The third graph shows the proportion of men among the participants in the creation of the films:</strong></p>
<div class="row">
<div style="flex: 33.333333333333336%">
<img class="single" src="/Site_web_ada/assets/img/feature-img/Q2_c.jpeg" alt="Q2_c.jpeg" />
</div>
</div>
<p>This graph, like the other two, shows that the hypothesis is rejected.</p>
Is it possible to predict the rating of a movie by using Quotebank and determine if gender has an impact on it?2014-11-29T00:00:00+00:002014-11-29T00:00:00+00:00https://igag9.github.io/Site_web_ada/2014/11/29/feature-images
<p>We ask whether a movie’s rating can be predicted from several features.</p>
<p>What are the <strong>features</strong>?</p>
<ul>
<li><em>gender_speaker_pct</em>: proportion of men among the speakers</li>
<li><em>gender_movie_pct</em>: proportion of men among all participants in the creation of the film</li>
<li><em>nb_quoteID_norm</em>: number of quote occurrences (standardized)</li>
<li>memorability:
<ul>
<li><em>gender_movie_pct</em>: proportion of men among all participants in the creation of the film</li>
<li><em>days_mean_norm_before</em>: average number of days before the release date on which the film is discussed (standardized)</li>
<li><em>days_max_norm_before</em>: number of days between the first day the film is discussed and its release (standardized)</li>
<li><em>nb_quote_month_norm_before</em>: number of quotes in the month before the release (standardized)</li>
<li><em>days_mean_norm</em>: average number of days after the release date on which the film is discussed (standardized)</li>
<li><em>days_max_norm</em>: number of days between the release and the last day the film is discussed (standardized)</li>
<li><em>nb_quote_month_norm</em>: number of quotes in the month after the release (standardized)</li>
</ul>
</li>
</ul>
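<p>As an illustration of how the post-release memorability features listed above might be derived, here is a minimal sketch. Everything in it is an assumption made for the example: the input columns <em>quote_date</em> and <em>release_date</em>, the invented rows, the 30-day definition of "one month", and the use of a z-score for the standardization.</p>

```python
import pandas as pd

# Hypothetical input: one row per quote mentioning a film, with the quote date
# and the film's release date. All values are invented for illustration.
df = pd.DataFrame({
    "title": ["Film X"] * 3 + ["Film Y"] * 3,
    "quote_date": pd.to_datetime([
        "2018-03-05", "2018-03-20", "2018-06-01",
        "2018-07-02", "2018-07-10", "2018-07-15",
    ]),
    "release_date": pd.to_datetime(["2018-03-01"] * 3 + ["2018-07-01"] * 3),
})

# Days elapsed between the release and each quote (positive = after release).
df["days_after"] = (df["quote_date"] - df["release_date"]).dt.days
after = df[df["days_after"] >= 0]

# Per-film features; "one month" is taken as 30 days (an assumption).
features = after.groupby("title").agg(
    days_mean=("days_after", "mean"),
    days_max=("days_after", "max"),
    nb_quote_month=("days_after", lambda d: int((d <= 30).sum())),
)

# z-score standardization (assumed): subtract the column mean, divide by its
# standard deviation, so that all features share a comparable scale.
standardized = (features - features.mean()) / features.std(ddof=0)
standardized.columns = [c + "_norm" for c in features.columns]

print(features)
print(standardized)
```

<p>The pre-release features (<em>days_mean_norm_before</em> and so on) would follow the same pattern, using the quotes with a negative <em>days_after</em>.</p>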
<p>What is the <strong>response</strong>?</p>
<ul>
<li>The rating of a film</li>
</ul>
<div class="row">
<div style="flex: 33.333333333333336%">
<img class="single" src="/Site_web_ada/assets/img/feature-img/Q3_a.jpeg" alt="Q3_a.jpeg" />
</div>
</div>
<p>As the graph and the value R<sup>2</sup> = -0.10 show, it does not seem possible to predict the rating with the proposed features. A negative R<sup>2</sup> means that the predictions are worse than always predicting the mean rating.</p>
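<p>To see why a negative r2 such as -0.10 indicates a poor fit, the sketch below computes the coefficient of determination by hand. The ratings and predictions are invented and the function is a plain NumPy implementation of the standard formula; it is not the model used in the project.</p>

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # error of the predictions
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # error of the mean baseline
    return 1.0 - ss_res / ss_tot


# Invented ratings and predictions, for illustration only.
ratings = [6.0, 7.0, 8.0]
mean_baseline = [7.0, 7.0, 7.0]  # always predict the mean rating
poor_model = [8.0, 7.0, 6.0]     # predictions that go the wrong way

print(r2_score(ratings, mean_baseline))  # 0.0
print(r2_score(ratings, poor_model))     # -3.0
```

<p>A score of 0 corresponds to always predicting the mean rating, so any negative value means the features add no predictive power over that baseline.</p>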