Game Metadata Analysis

For our final project in DS3000 Foundations of Data Science, my team (2) and I gathered data from various video game platforms and ran metadata analysis on it in Python.


Tools, libraries, and languages used: Python, BeautifulSoup, pandas, numpy, scikit-learn

For this project, we gathered data from various video game platforms, and ran ML algorithms on them to try and find correlations/predictions between their audience ratings and various productions/metadata factors.

For the scraping of data, I helped write an HTML scraping python script using BeautifulSoup to gather data off of the SteamCharts website. I also wrote python scripts to gather data from the iGDB and Steam APIs. For our analysis, we cleaned and formatted the data using pandas and numpy, then trained and ran ML models from scikit-learn on our data. Although we weren’t able to get any significant correlations between the various factors we identified in games like release date, genre and tags, and playtime, it was still a great practice in gathering data, cleaning it, analyzing it, and running various ML models.

Our final analysis can be found in our Google Colab here

Updated: