Data Analysis: Violence Against Women and Girls

I just finished reading the book ‘Invisible Women: Data Bias in a World Designed for Men’ by Caroline Criado Perez. The book is all about how we can use data to show that women have been overlooked, unheard, misunderstood, and misrepresented throughout time. 


However, it doesn’t simply use data that exists. In many cases, it highlights data sets with gaps: data that isn’t broken down by sex. These data sets have informed many business, political, social, economic, and infrastructure decisions throughout history. Failing to disaggregate this data leads to many situations that are inconvenient, ill-informed, and even dangerous for women and girls. Typically, the aggregated data carries a default male bias, which fails not only women but all of us as a civilization.

There are myriad examples throughout the book where women suffer as a result of culture, society, education, language, business environments, products, and government institutions being built for the default male human. 

Gender Bias in Car Crash Safety Tests

One example that struck me as particularly wrong was car crash testing. I’m sure we’ve all seen crash test dummies – 170lb Ken dolls made to simulate the ‘average driver.’ However, males and females have different body shapes, builds, muscle and bone densities, etc. Testing car crash safety ratings against a dummy that models the average for only half of the population leaves women severely at risk. One would think tests on male and female crash test dummies would be required by law in the US, but that is not the case. And, of course, if it costs some extra money and doesn’t add to the bottom line, companies will exclude it.

This example stood out to me for two reasons: 1.) I was shocked to find out that female crash test dummies were not required by law for safety tests and 2.) there’s a billboard for Volvo near a highway exit by my house, one I drive by often, touting that their cars are designed to be safe for all people, not just the average male. I hadn’t really thought about what that meant until I read the statistics in Invisible Women about how women are more likely than men to sustain severe injuries in a car crash.

Including Women in the Conversation

The author of the book cites many occasions where women are overlooked through either decisions made from gender-aggregated data or being excluded from the decision-making process altogether.

On just one aspect of this, the author writes, “When we exclude half the population from knowledge production, we severely miss out on profound innovations and insights that could be gained.”

While reading this book, I was telling my wife about the appalling statistics, and she said, “What are you going to do about it?” (I know – she’s incredible.)

One thing it has spurred me on to do is explore different data sets as they pertain to women.

The following is my exploratory data analysis of a data set from Operation Fistula: survey answers about whether violence against women and girls is considered justified.


This data set is from Operation Fistula. It covers responses from men and women, grouped by age, in 70 different countries, primarily in Africa and Asia, collected between 2017 and 2018.

Respondents were asked if they agreed with the following statements:

  • A husband is justified in hitting his wife if she burns the food.
  • A husband is justified in hitting his wife if she argues with him.
  • A husband is justified in hitting his wife if she goes out without telling him.
  • A husband is justified in hitting his wife if she neglects the children.
  • A husband is justified in hitting his wife if she refuses to have sex with him.
  • A husband is justified in hitting his wife for at least one specific reason.

For each statement, the data set then gives a value representing the percentage of that demographic that agrees with it. The higher the value, the more people agree with this reason for domestic violence.
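
A minimal sketch of how data shaped like this could be summarised by group, assuming one row per country, gender, age group, and statement. The field names (country, gender, age_group, question, value) and every number below are placeholders for illustration only. They are not column names or results from the Operation Fistula data set, and this is not the code from the notebook.

```python
from collections import defaultdict
from statistics import mean

# Placeholder rows: names and numbers are made up for illustration,
# NOT taken from the real survey data.
rows = [
    {"country": "Country A", "gender": "F", "age_group": "15-24",
     "question": "burns the food", "value": 10.0},
    {"country": "Country B", "gender": "F", "age_group": "25-34",
     "question": "burns the food", "value": 20.0},
    {"country": "Country A", "gender": "M", "age_group": "15-24",
     "question": "burns the food", "value": 30.0},
    {"country": "Country B", "gender": "M", "age_group": "25-34",
     "question": "burns the food", "value": None},  # missing survey value
]


def mean_agreement(records, key):
    """Average the percentage 'value' over records that share the same `key` field.

    Records with a missing value (None) are skipped rather than treated as 0.
    """
    groups = defaultdict(list)
    for record in records:
        if record["value"] is not None:
            groups[record[key]].append(record["value"])
    return {group: round(mean(values), 1) for group, values in groups.items()}


by_gender = mean_agreement(rows, "gender")
by_age_group = mean_agreement(rows, "age_group")
print(by_gender)
print(by_age_group)
```

Note that this is an unweighted average of percentages: each row counts equally regardless of how many people were surveyed in that country, so it is a rough comparison between groups rather than a population-level figure.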

Resources: The full data set, data set description, data set dictionary, and my Jupyter Notebook for making these plots are all available via my GitHub profile.

Data Plots
