Trends in Entrepreneurship: Insights from Big Data


In my final year at IITB, I took a course on entrepreneurship offered by the school of management (we were required to take one course from outside the department). The course was a lot of fun, and we were free to do whatever we wanted for our final project.

I decided to check whether some of the commonly assumed trends and biases about entrepreneurship are actually backed by data.

Interesting Findings:

  • What is more risky: Entrepreneurship or Flying?

We next plot the frequency of phrases in books (using the Google n-gram corpus) in which the word “risk” modifies “entrepreneurship”, and compare it with phrases in which “risk” modifies “flying”. As the following figure shows, the phrases where entrepreneurship is modified by risk outnumber the phrases where flying is modified by risk! This holds even though, as figure 4 shows, “flying” occurs more frequently than “entrepreneurship” overall.
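For readers who want to try a comparison like this themselves, here is a minimal sketch of the normalization step. It assumes the per-year counts have already been pulled out of the corpus into plain dictionaries; the function names and the toy numbers are made up for illustration and are not results from the project.

```python
def relative_frequency(match_counts, total_counts):
    """Per-year share of the corpus taken up by a phrase.

    match_counts: {year: occurrences of the phrase in that year}
    total_counts: {year: total number of phrases counted in that year}
    Years with a missing or zero total are skipped.
    """
    return {
        year: match_counts[year] / total_counts[year]
        for year in sorted(match_counts)
        if total_counts.get(year, 0) > 0
    }


def years_where_higher(freq_a, freq_b):
    """Years present in both series in which series A exceeds series B."""
    return [y for y in sorted(freq_a) if y in freq_b and freq_a[y] > freq_b[y]]


if __name__ == "__main__":
    # Toy numbers, invented for this example only.
    totals = {1990: 1000, 2000: 2000}
    risk_entrepreneurship = relative_frequency({1990: 3, 2000: 10}, totals)
    risk_flying = relative_frequency({1990: 4, 2000: 6}, totals)
    print(years_where_higher(risk_entrepreneurship, risk_flying))
```

Dividing by the yearly total matters because the number of books in the corpus changes from year to year, so raw counts alone would not be comparable over time.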

  • Young vs. old; male vs. female

We also plot “young entrepreneur”, “old entrepreneur”, “male entrepreneur” and “woman entrepreneur”. As figure 6 shows, books talk about “young entrepreneur” more than “old entrepreneur”, which is somewhat expected. The high frequency of the phrase “woman entrepreneur” can perhaps be explained by the rise of gender equality movements around the globe in recent times. The term “male entrepreneur” is quite unnatural and uncommon, which is likely the reason for its low frequency.

  • Entrepreneurship as a career

Twitter N-gram corpus

We want to answer the following two questions using this dataset:

  • Judging by social media, is one gender clearly more interested in entrepreneurship than the other?
  • Do people think about startups/entrepreneurship more on some days than on others?
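
The second question comes down to grouping mention counts by day of the week. Below is a rough sketch of that grouping, assuming the data has already been reduced to (date, count) pairs. The record layout, the function name, and the numbers are hypothetical and do not reflect the actual format of the Twitter N-gram corpus.

```python
from collections import defaultdict
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def mean_mentions_by_weekday(daily_counts):
    """Average number of mentions per weekday.

    daily_counts: iterable of (datetime.date, count) pairs, one per day.
    Averaging (rather than summing) keeps weekdays comparable when the
    sample contains, say, five Mondays but only four Sundays.
    """
    totals = defaultdict(int)
    days_seen = defaultdict(int)
    for day, count in daily_counts:
        name = WEEKDAYS[day.weekday()]  # date.weekday(): Monday is 0
        totals[name] += count
        days_seen[name] += 1
    return {name: totals[name] / days_seen[name]
            for name in WEEKDAYS if days_seen[name]}


if __name__ == "__main__":
    # Invented sample, for illustration only.
    sample = [
        (date(2024, 1, 1), 10),  # a Monday
        (date(2024, 1, 8), 20),  # the following Monday
        (date(2024, 1, 6), 4),   # a Saturday
    ]
    print(mean_mentions_by_weekday(sample))
```

The same grouping works for the first question if the key is an (assumed) gender label attached to each record instead of the weekday.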