Posts

A detailed analysis of correlations between music and mental health

Objective: Find whether any correlations exist between an individual's music taste and their self-reported mental health. Result/Outcome: First, we look at basic insights from our analysis to understand the data better, and after that we discuss how music affects a person's health. People aged 16-24 listen to music most frequently. People working for 1-3 hours tend to listen to music more than those working longer hours. Most of the songs listened to fall in the range of 100-125 BPM (beats per minute). Spotify is the most used streaming service, followed by YouTube Music. Rock is the most loved genre, followed by pop and metal. Nearly 80% of people listen to music while working. Every listener has a Depression level above 3. Lofi, Hip hop, and Rock listeners have Depression levels above 5. The maximum OCD level is above 3 in Rap and Lofi listeners. Every listener has an Insomnia level below 4, except Metal, Lofi, and Gospel listeners. Every listener has an Anxiety level above 4 but Rock, ...

Using NARM along with a genetic algorithm to find patterns of association among the indicators of a dataset

Objective: Find a pattern of association among the indicators of the given dataset using numerical association rules generated via a genetic algorithm. Data collection and preparation: For this project, we use the United Nations SDG 5 dataset. This dataset is cleaned and pre-processed, and all of its parameters are renamed to I1, I2, I3, and so on for ease of analysis. EDA: 1. Understanding the makeup of the data 2. Understanding the correlation between different parameters 3. Box plots. Numerical rule generation using a genetic algorithm: For this purpose, we use QuantMiner. After the rules are generated, I copy them to a text file. This is what the text file looks like:
1. support = 34 (91%), confidence = 100% : I1 in [12.1; 115.2] --> I2 in [0.0; 958.0]
2. support = 35 (94%), confidence = 100% : I2 in [0.0; 941.0] --> I1 in [4.1; 177.8]
3....
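
To work with rules in the text format shown above, they first have to be read back into a structured form. The following is a minimal sketch of one way to do that in Python, not the code used in the post. The names `Rule` and `parse_rules` are made up for illustration, and the regular expression assumes that each side of a rule holds exactly one "indicator in [low; high]" item, as in the two sample rules.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    """One numerical association rule (hypothetical container for this sketch)."""
    support_count: int        # number of records covered by the rule
    support_pct: float        # the same support as a percentage
    confidence_pct: float     # rule confidence as a percentage
    antecedent: tuple         # (indicator, low, high) on the left of -->
    consequent: tuple         # (indicator, low, high) on the right of -->


# One item such as "I1 in [12.1; 115.2]"; whitespace is matched loosely.
_ITEM = r"(\w+)\s+in\s+\[\s*([-\d.]+)\s*;\s*([-\d.]+)\s*\]"
_RULE = re.compile(
    r"support\s*=\s*(\d+)\s*\(\s*([\d.]+)\s*%\s*\)\s*,\s*"
    r"confidence\s*=\s*([\d.]+)\s*%\s*:\s*"
    + _ITEM + r"\s*-->\s*" + _ITEM
)


def parse_rules(text):
    """Return every rule found in `text` that matches the assumed format."""
    rules = []
    for match in _RULE.finditer(text):
        count, sup, conf, a, a_lo, a_hi, c, c_lo, c_hi = match.groups()
        rules.append(Rule(
            int(count), float(sup), float(conf),
            (a, float(a_lo), float(a_hi)),
            (c, float(c_lo), float(c_hi)),
        ))
    return rules


# The two complete rules quoted in the excerpt above.
SAMPLE = """
1.  support = 34 (91%) , confidence = 100 %  :  I1 in [12.1; 115.2]   -->   I2 in [0.0; 958.0]
2.  support = 35 (94%) , confidence = 100 %  :  I2 in [0.0; 941.0]   -->   I1 in [4.1; 177.8]
"""

for rule in parse_rules(SAMPLE):
    print(rule.antecedent, "-->", rule.consequent, rule.confidence_pct)
```

Rules with several items on one side would need a more general pattern; this sketch silently skips any line it cannot match.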

Implementing an unsupervised technique to classify the GitHub commits by their quality

Project inspiration: We have all seen projects that try to classify the quality of projects based on some stats to achieve some objective. This is a similar project, but with the intent of building a system that can help developers manage their projects efficiently by giving high priority to certain commits. Objective: Build a system that can classify GitHub commits on the basis of their quality using an unsupervised method (K-medoids and random forest). Result/Outcome: The algorithm divided the commits into three categories: Cluster 1, Cluster 2, and Cluster 3. Cluster 1 represents low-quality commits, Cluster 2 represents mid-quality commits, and Cluster 3 represents high-quality commits. This figure shows how the algorithm classified more than 300,000 commits. Below are tables that describe the properties of each cluster. Performance metrics: The performance metric table is generated by a random forest algorithm. It tells u...
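
For readers unfamiliar with K-medoids, the clustering step can be sketched in a few lines of plain Python. This is an illustrative sketch of the simple alternating variant of K-medoids, not the implementation used in the post. The function names, the Manhattan distance, and the toy feature vectors are all assumptions made up for the example, since the excerpt does not say which commit features were used.

```python
import random


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length numeric tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def k_medoids(points, k, dist=manhattan, max_iter=100, seed=0):
    """Alternating K-medoids: assign points to the nearest medoid, then
    move each medoid to the cluster member with the lowest total distance
    to the other members. Returns (medoid indices, label per point)."""
    rng = random.Random(seed)
    medoids = rng.sample(range(len(points)), k)
    for _ in range(max_iter):
        # Assignment step: group point indices by their nearest medoid.
        clusters = {m: [] for m in medoids}
        for i, p in enumerate(points):
            nearest = min(medoids, key=lambda m: dist(p, points[m]))
            clusters[nearest].append(i)
        # Update step: choose the most central member of each cluster.
        new_medoids = []
        for m, members in clusters.items():
            if not members:  # possible only with duplicate points
                new_medoids.append(m)
                continue
            best = min(
                members,
                key=lambda c: sum(dist(points[c], points[j]) for j in members),
            )
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break  # medoids stopped changing
        medoids = new_medoids
    labels = [
        min(range(k), key=lambda c: dist(p, points[medoids[c]]))
        for p in points
    ]
    return medoids, labels


# Made-up toy vectors standing in for per-commit features (hypothetical).
toy_commits = [(1, 2), (2, 1), (2, 3), (10, 12), (11, 10), (12, 11),
               (30, 28), (29, 31), (31, 30)]
medoids, labels = k_medoids(toy_commits, k=3)
print("medoid indices:", medoids)
print("cluster labels:", labels)
```

Unlike K-means, each cluster center here is always an actual data point, so the representative of a cluster is a real commit that can be inspected. The cluster numbers the algorithm returns are arbitrary; mapping them to low, mid, and high quality is a separate interpretation step.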