November 28, 2019

Fantastic Programming Project Ideas and Where To Find Them (a beginner-friendly version)

20 cool project ideas for data science, machine learning, app development, and web development

We all know that working on personal projects is a really important part of learning. It’s honing the discipline. Moving from theory to practice. A way of learning by doing.

It’s building to learn.

But isn’t coming up with good project ideas the very thing that keeps you from building stuff in the first place? Isn’t it a narrow bottleneck?

Aren’t you tired of those lists of programming project ideas that suggest you analyze the Titanic dataset or a flower dataset or build things like a to-do list app, a snake game, a calculator, an ecommerce website, or something else that no one is going to find cool?

I find them really boring because no one is ever going to be excited about using the final result. To be honest, not even me.

I believe that you can train your mind to get better ideas — anyone can think of good ideas. If you look at enough of these projects and maybe work on a few, your mind will learn to recognize cool things that will be interesting to work on.

So, here I present some project ideas that I find really cool, grouped by their sources — my goldmines of fantastic project ideas.

Source 1:

Browsing through other people’s hackathon projects is a great way to come across good project ideas, because:

  • A lot of them are just small, cozy, warm pet projects: being personal projects built by young programmers in just 12 or 24 or 48 hours, they are definitely doable. (Yes, you can do it!)
  • You will know that you are building something worthwhile: most of them are hackathon-winning projects

How cool would it be if you created:

1. Search inside a YouTube video

A web application that lets you search inside a YouTube video and gets you to the point where that word is uttered — a Ctrl-F capability for videos.

And tricked yourself into learning Python and basic web development (HTML/CSS, JavaScript).
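
The Ctrl-F idea boils down to searching timestamped captions for a word. Here is a minimal Python sketch of just that step. It assumes you already have the captions as (start time, text) pairs; the `CAPTIONS` sample and the function names are made up for illustration, and fetching real captions for a video is left out.

```python
import re

# Hypothetical caption data: (start time in seconds, caption text).
# A real app would have to fetch the captions for the video first.
CAPTIONS = [
    (0.0, "Welcome back to the channel"),
    (4.5, "Today we are talking about Python decorators"),
    (9.0, "A decorator wraps another function"),
    (15.2, "Python makes this easy"),
]


def find_word(captions, word):
    """Return the start times of every caption segment that contains `word`."""
    # Whole-word, case-insensitive match.
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [start for start, text in captions if pattern.search(text)]


def to_timestamp(seconds):
    """Format a number of seconds as M:SS for display."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


if __name__ == "__main__":
    for start in find_word(CAPTIONS, "python"):
        print(to_timestamp(start))
```

Each returned start time is a point the web app could jump the video player to.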

2. A browser extension that refers you to a story with an opposite political view to the one you are reading

This will combat the effects of newsfeeds that allow people to only see posts on social media and news sites that agree with their point of view.

And tricked yourself into learning basic web development (HTML/CSS, JavaScript, jQuery), and maybe some machine learning.

3. A web app to plan travel

A web app that lets you enter the day and place you plan to travel to and the amount of money you will be bringing along, and that gives you useful information about weather conditions and the value of your money.

And tricked yourself into learning basic web development (HTML/CSS/JavaScript) and APIs.

4. A simple notifications app to block notifications

A notifications app that lets you select the messaging apps whose notifications you want blocked whenever they arrive more often than once every three seconds.

And tricked yourself into learning Android development.
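
The "more than one every three seconds" rule is easy to prototype before you touch any Android code. Below is a rough sketch of only the rate check, written in Python for readability. The `NotificationThrottle` class and its method names are hypothetical, and a real app would have to hook this logic into Android's notification APIs.

```python
MIN_GAP_SECONDS = 3.0  # "one every three seconds" from the idea above


class NotificationThrottle:
    """Tracks the last notification time per app and flags bursts."""

    def __init__(self, min_gap=MIN_GAP_SECONDS):
        self.min_gap = min_gap
        self.last_seen = {}  # app name -> timestamp of its previous notification

    def should_block(self, app, timestamp):
        """Return True if this notification arrives too soon after the previous one."""
        previous = self.last_seen.get(app)
        self.last_seen[app] = timestamp
        return previous is not None and (timestamp - previous) < self.min_gap


if __name__ == "__main__":
    throttle = NotificationThrottle()
    for app, ts in [("chat", 0.0), ("chat", 1.0), ("chat", 5.0), ("mail", 5.5)]:
        print(app, ts, throttle.should_block(app, ts))
```

Keeping one timestamp per app is the simplest possible design; a sliding window over the last few notifications would be a natural next step if this turns out to be too trigger-happy.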

5. A messaging app that automatically sends texts

A messaging app that automatically sends a text to your loved ones letting them know that you’ve reached a particular destination because you often forget to do so.

And tricked yourself into learning Android app development.

Source 2: Kaggle

I believe that if you want to get into data science/ML, Kaggle is your one-stop shop to learn and practice the craft.

  • Datasets: With around 300 competition challenges, all accompanied by their public datasets, and 9500+ datasets in total (and more being added constantly) this place is like a treasure trove of data science/ML project ideas.
  • Kernels: All the datasets have a public kernels tab where people can post their analysis for the benefit of the entire community. So, anytime you feel like you don’t know what to do next, you can be sure to get some ideas by looking at those kernels. Besides, a lot of those kernels are written especially to help beginners.
  • Courses: This tab contains free, practical, hands-on courses that cover the minimum prerequisites needed to quickly get started in the field. The best thing about them? Everything is done using Kaggle’s kernels (described above). This means that you can interact and learn… no more passive reading through hours of learning material!

6. Spotify’s worldwide daily song dataset

This dataset contains the daily ranking of the 200 most-listened-to songs in 53 countries from 2017 and 2018 by Spotify users. It contains more than two million rows, covering 6629 artists and 18598 songs, for a total of 105 billion streams.

And find answers to:

  • How long do songs stay in the top 3, 5, 10, 20 in your country? Which songs are the outliers?
  • Which countries have similar tastes in music?
  • How much time does a top-ranking song take to get into the ranking of neighboring countries?
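
To give a feel for the first question, here is a small Python sketch that counts how many days each song spent in the top N of one country. The `ROWS` sample is made up, and the column names (`date`, `region`, `track`, `position`) are assumptions rather than the dataset's real schema, so check the actual file before reusing this. It also assumes one row per track per region per day.

```python
from collections import Counter

# Tiny made-up sample; the column names are assumptions, not the real schema.
ROWS = [
    {"date": "2017-01-01", "region": "in", "track": "Song A", "position": 1},
    {"date": "2017-01-01", "region": "in", "track": "Song B", "position": 4},
    {"date": "2017-01-02", "region": "in", "track": "Song A", "position": 2},
    {"date": "2017-01-02", "region": "in", "track": "Song B", "position": 3},
    {"date": "2017-01-02", "region": "us", "track": "Song A", "position": 1},
    {"date": "2017-01-03", "region": "in", "track": "Song A", "position": 6},
]


def days_in_top(rows, region, n):
    """Count, per track, the days it ranked in the top `n` for `region`."""
    days = Counter()
    for row in rows:
        if row["region"] == region and row["position"] <= n:
            days[row["track"]] += 1
    return days


if __name__ == "__main__":
    for n in (3, 5):
        print(n, dict(days_in_top(ROWS, "in", n)))
```

Songs whose counts sit far above the rest for a given N are your outliers.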

7. Young people survey dataset

This explores the preferences, interests, habits, opinions, and fears of young people.

1010 students were asked questions regarding their:

  • Music preferences
  • Movie preferences
  • Hobbies and interests
  • Phobias
  • Health habits
  • Personality traits, views on life, and opinions
  • Spending habits
  • Demographics

8. Dark Net marketplace dataset

I find the Dark Net simply fascinating.

This is a parse of marketplace data ripped from Agora (a dark/deep web marketplace) from the years 2014 to 2015. It lists drugs, weapons, books, services, and more.

Here’s some inspiration:

  • Description of this dataset:
“This data set was made from an HTML rip made by Reddit user “usheep” who threatened to expose all the vendors on Agora to the police if they did not meet his demands (sending him a small monetary amount, a few hundred dollars in exchange for him not leaking their info).
Most information about what happened to “usheep” and his threats is nonexistent. He posted the HTML rip and was never heard from again. Agora shut down a few months after.
It is unknown if this was related to “usheep” or not, but the raw HTML data remained.” —

  • The kinds of items in this dataset: Facebook hacking guide, ATM hacking tutorial, 50000 Facebook likes, fake IDs, licenses, and lots of drugs and prostitution-related entries.

9. News headlines of India

This contains 18 years of headlines focussing on India.

It contains approximately 2.9 million events published by the Times of India from 2001 to 2018.

You could use this to:

  • Run a sentiment analysis on the headlines and see for yourself — do the news agencies focus on bad news more than good news?
  • Understand what the most popular topics are in Indian society.
  • Chop this dataset into a smaller piece for a more focused analysis on categories like Bollywood, political parties, cricket, and see the trend over the years

10. StackOverflow Developer Survey of more than 100,000 developers

You could use this meaty survey to arrive at data-backed answers to the following questions:

  • Do people learn by contributing to open-source projects?
  • How do opinions about AI differ across countries/age/dev roles?
  • What are the views and opinions of students? (One out of every five respondents in this survey is a student.)
  • How do Vim users differ from non-Vim users?
  • Create a salary predictor.

I used it to make a comparison of software developers in India with the ones in the US, UK, Germany, and the entire world at large.

Source 3: Data is Plural

This is yet another source for data science or machine-learning projects. It is a free email newsletter where the author sends you a bunch of curious datasets each week.

Why you should analyze curious datasets for your personal projects:

  1. They are thrilling to work on — you are curious about knowing the results of the analysis yourself.
  2. They are an easy way to create interesting projects: even a simple analysis of an inherently interesting dataset will be interesting.

Alright, so here are some cool ones from Data is Plural’s archives:

11. A dataset of 2,656 TED talks, with metadata and transcripts

TED talks have become an integral part of our culture.

“A group of teenagers cluster near their lockers, enjoying quick conversations between classes. One of them goes a little too long and, realizing it, addresses the group and the situation by announcing: “Well, thanks for coming to my TED talk.”
The rest laugh, nod their heads, and the conversational flow returns to normal before the bell sounds announcing that classes are about to begin.” — Field notes by one of the authors,

Analyze these transcripts to reveal some intricacies about our culture.

12. How couples meet and stay together

It is a survey of 4,002 adults, 3,009 of whom had a spouse or main romantic partner. It even has follow-up surveys that were conducted one and two years after the main survey, to study couple dissolution rates.

An analysis can reveal answers to the following questions:

  • Do traditional couples and nontraditional couples meet in the same way? What kinds of couples are more likely to have met online?
  • Did the most recent marriage cohorts (especially the traditional heterosexual same-race married couples) meet in the same way their parents and grandparents did?
  • Does meeting online lead to greater or less couple stability?
  • How do the couple dissolution rates of nontraditional couples compare to the couple dissolution rates of more traditional same-race heterosexual couples?
  • How does the availability of civil union, domestic partnership, or same-sex marriage rights affect couple stability for same-sex couples?

13. Electricity in rural India

Smart Power India and the Initiative for Sustainable Energy Policy published a survey dataset that “covers 10,000 households and 2,000 rural enterprises across 200 villages in Bihar, Uttar Pradesh, Odisha, and Rajasthan.”

Respondents were asked, among other things, how many hours per day they get electricity, whether they have solar panels, and the price they pay for kerosene.

Do an analysis to understand exactly how dire the state of rural India is and compare it with your own conditions.

14. Deaths in jobs

Since 1992, the US Bureau of Labor Statistics has collected data on work-related deaths through its Census of Fatal Occupational Injuries.

You could do a detailed study of the jobs to avoid, maybe?

15. A dataset of sarcasm in TV shows like Friends and The Big Bang Theory

MUStARD is a corpus of 690 text and video clips “for research in automated sarcasm discovery.”

The dataset’s 690 examples — half involving sarcasm, half not — come from Friends, The Golden Girls, The Big Bang Theory, and Sarcasmaholics Anonymous.

I bet there are lots of interesting things we could do with this hilarious dataset!

Source 4: Y. O. U.

Oh yes, I did that!

I wrote in the beginning — you can train your mind to come up with good ideas yourself.

I think Paul Graham’s advice on how to find startup ideas also kind of applies to how to find your pet project ideas.

The way to get startup ideas is not to try to think of startup ideas. It’s to look for problems, preferably problems you have yourself.

At the same time, and this may sound like I’m contradicting myself, you don’t want to set the bar too high.

You may have watched the movie The Social Network too much and hope to make the next Google or Facebook out of this project. But you shouldn’t. This will only slow down the learning, make you create unrealistic goals, and most dangerously, make you procrastinate.

Remember, your goal is not to write billion-dollar software. It is to create a program that gives you a stage to work on and simply learn from. Like, for instance:

16. While chatting, my friend and I discussed how cool it would be to build a tool to analyze our WhatsApp chats and reveal things like the number of messages sent, number of words sent, average number of words per message, most common words, longest double-texting streak, chat hour pattern, most shared website links, and more.

We later found out that we had rediscovered an idea that was really popular on Reddit once.

What was awesome was that in the process of building it, she turned her Python skills up a notch. Now, we might even try our hand at web development and build a website that allows anyone to run an analysis on their own chat file!
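
If you want a head start, here is a minimal sketch of a few of those statistics in Python. It assumes every message sits on one line in the form `28/11/19, 21:05 - Alice: hello there`. That format, the `SAMPLE` text, and the function names are all assumptions for illustration; real exports differ by phone and locale, so adjust the regular expression to match your own file.

```python
import re
from collections import Counter

# Assumed line format: "28/11/19, 21:05 - Alice: hello there"
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{2,4}), (\d{1,2}):(\d{2}) - ([^:]+): (.*)$")

SAMPLE = """\
28/11/19, 21:05 - Alice: hello there
28/11/19, 21:06 - Alice: are you awake
28/11/19, 21:07 - Alice: hello
28/11/19, 22:15 - Bob: yes hello
"""


def parse(text):
    """Yield (hour, sender, message) for each line matching the assumed format.

    Lines that do not match (for example, continuations of multi-line
    messages) are skipped in this sketch.
    """
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            _, hour, _, sender, message = match.groups()
            yield int(hour), sender, message


def analyze(text):
    """Compute a handful of the chat statistics described above."""
    messages = list(parse(text))
    words = [w.lower() for _, _, msg in messages for w in msg.split()]

    # Longest run of consecutive messages from the same sender.
    longest, current, previous = 0, 0, None
    for _, sender, _ in messages:
        current = current + 1 if sender == previous else 1
        longest = max(longest, current)
        previous = sender

    return {
        "messages": Counter(sender for _, sender, _ in messages),
        "total_words": len(words),
        "avg_words_per_message": len(words) / len(messages) if messages else 0.0,
        "most_common_words": Counter(words).most_common(3),
        "longest_streak": longest,
        "messages_by_hour": Counter(hour for hour, _, _ in messages),
    }


if __name__ == "__main__":
    for key, value in analyze(SAMPLE).items():
        print(key, value)
```

The most-shared-links statistic is left out here; it would need one more pass over the messages looking for URLs.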

17. I use Chrome bookmarks a lot. I really needed a way to add comments to my bookmarks so that I could record why I bookmarked each awesome link. But Chrome doesn't have an option to comment. That is why I built a simple Chrome extension to help me add comments to my bookmarks!

And I tricked myself into learning JavaScript, jQuery, and HTML.

18. When Game of Thrones released its last season a few months ago, I thought of building a script to analyze the sentiment of tweets about the various Game of Thrones seasons to learn just how bad the last season was (😜).

Do this and you can trick yourself into learning Python, machine learning, and NLP.

19. And since your goal is to learn, you shouldn't feel bad about reimplementing an existing idea. One day I came across a popular post on Hacker News called "I taught my little brother JS and he built this videogame in a week". I checked out the game and it was kind of addictive but really simple. I told my above-mentioned friend about it, and we are building a Python version of this cool game using PyGame.

20. A simple app that reminds you to follow up with important, busy people that you want to connect with. I recently read an article by Alexey Guzey on how you shouldn't expect busy people to reply to your first message and how it is your responsibility to follow up with them. But when you have a bunch of important people to talk to, it can be a little difficult to keep track of the follow-ups. This app will do that for you and also remind you about future follow-ups.

So, here are three final pointers on how to come up with (sort of) cool ideas:

  • Keep your eyes open.
  • Set the bar low.
  • Don’t hesitate to reimplement.

This is definitely not an exhaustive list of sources of cool project ideas. There are a lot more goldmines like this out there but, of course, they are difficult to find. I’ll update this post as I discover more of them.

I’ll announce any updates to this article on my Twitter, in the Build To Learn newsletter, and in the Build To Learn Slack group.

So, follow and subscribe to keep in touch.

You can also reach out to me on both Twitter and LinkedIn.

Next in this series of articles, I will take apart the above projects one by one and give you a detailed roadmap for building them and learning along the way.

I have started with the WhatsApp chat analyzer project: