Using NARM with a genetic algorithm to find patterns of association among the indicators of a dataset

Objective

Find patterns of association among the indicators of the given dataset using numerical association rules generated by a genetic algorithm.

Data collection and preparation

For this project, we will use the United Nations SDG 5 dataset. The dataset has already been cleaned and pre-processed. All of its parameters are renamed to I1, I2, I3, and so on, for ease of analysis.
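As a small illustration of the renaming step, here is a minimal sketch that builds a mapping from original column names to the short codes I1, I2, I3, and so on. The function name `rename_indicators` and the column names are hypothetical; the real SDG 5 column names are not listed in this post.

```python
def rename_indicators(columns):
    """Map each original column name to a short code I1, I2, ... in order."""
    return {name: f"I{i}" for i, name in enumerate(columns, start=1)}


# Hypothetical column names, invented for illustration only.
mapping = rename_indicators(["indicator_a", "indicator_b", "indicator_c"])
print(mapping)
```

Keeping the mapping around makes it possible to translate a rule such as "I1 --> I2" back to the original indicator names later.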


Exploratory data analysis (EDA)

1. Understanding the makeup of the data



2. Understanding the Correlation between different parameters




3. Box Plots








Numerical rules generation using genetic algorithm


For this purpose, we will use QuantMiner, a tool that mines quantitative association rules using a genetic algorithm.

After the rules are generated, I copy them to a text file. This is what the text file looks like:

1.  support = 34 (91%) , confidence = 100 %  :  I1 in [12.1; 115.2]   -->   I2 in [0.0; 958.0]
2.  support = 35 (94%) , confidence = 100 %  :  I2 in [0.0; 941.0]   -->   I1 in [4.1; 177.8]
3.  support = 31 (83%) , confidence = 100 %  :  I1 in [17.2; 110.4]   -->   I3 in [0.53; 0.96]
4.  support = 35 (94%) , confidence = 100 %  :  I3 in [0.53; 0.95]   -->   I1 in [4.1; 177.8]
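To make the support and confidence figures in these rules concrete, here is a minimal sketch of how they could be computed for a single interval rule. It assumes the usual definitions: support is the number (or fraction) of records that satisfy both sides of the rule, and confidence is that number divided by the number of records satisfying the antecedent. The function name `rule_stats` and the toy records are hypothetical and are not taken from the SDG 5 dataset or from QuantMiner itself.

```python
def rule_stats(rows, antecedent, consequent):
    """Return (support_count, support_fraction, confidence) for one interval rule.

    rows: list of dicts mapping indicator name -> numeric value
    antecedent / consequent: (indicator, low, high), bounds inclusive
    """
    def matches(row, cond):
        name, low, high = cond
        return low <= row[name] <= high

    n_antecedent = sum(1 for r in rows if matches(r, antecedent))
    n_both = sum(
        1 for r in rows if matches(r, antecedent) and matches(r, consequent)
    )
    support = n_both / len(rows) if rows else 0.0
    confidence = n_both / n_antecedent if n_antecedent else 0.0
    return n_both, support, confidence


# Toy records, invented for illustration only.
rows = [
    {"I1": 20.0, "I2": 100.0},
    {"I1": 50.0, "I2": 900.0},
    {"I1": 110.0, "I2": 30.0},
    {"I1": 200.0, "I2": 1200.0},
]
count, support, confidence = rule_stats(
    rows, ("I1", 12.1, 115.2), ("I2", 0.0, 958.0)
)
print(count, support, confidence)
```

In this toy example, three of the four records fall inside both intervals, and every record that satisfies the antecedent also satisfies the consequent, which is what a 100% confidence rule means.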


After that, I use Python code to transfer the rules from the text file to a CSV file, and here is what it looks like.
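A conversion like this could be sketched as follows. This is only a minimal example, assuming every rule line follows exactly the format shown in the text file above. The names `parse_rule`, `rules_to_csv`, and the CSV column names are hypothetical and not taken from the original script. A rule with several conditions on one side would be kept as a single string in its column.

```python
import csv
import io
import re

# One rule per line: "<id>.  support = <n> (<p>%) , confidence = <c> %  :  <lhs>  -->  <rhs>"
RULE_RE = re.compile(
    r"^\s*(?P<rule_id>\d+)\.\s*"
    r"support\s*=\s*(?P<support_count>\d+)\s*\((?P<support_pct>\d+(?:\.\d+)?)%\)\s*,\s*"
    r"confidence\s*=\s*(?P<confidence_pct>\d+(?:\.\d+)?)\s*%\s*:\s*"
    r"(?P<antecedent>.+?)\s*-->\s*(?P<consequent>.+?)\s*$"
)

FIELDS = [
    "rule_id", "support_count", "support_pct",
    "confidence_pct", "antecedent", "consequent",
]


def parse_rule(line):
    """Return a dict of rule fields, or None if the line is not a rule."""
    match = RULE_RE.match(line)
    return match.groupdict() if match else None


def rules_to_csv(lines, out_file):
    """Write every parsable rule line to out_file as CSV; return the row count."""
    writer = csv.DictWriter(out_file, fieldnames=FIELDS)
    writer.writeheader()
    written = 0
    for line in lines:
        row = parse_rule(line)
        if row:
            writer.writerow(row)
            written += 1
    return written


sample = [
    "1.  support = 34 (91%) , confidence = 100 %  :  I1 in [12.1; 115.2]   -->   I2 in [0.0; 958.0]",
    "2.  support = 35 (94%) , confidence = 100 %  :  I2 in [0.0; 941.0]   -->   I1 in [4.1; 177.8]",
]
buffer = io.StringIO()
written = rules_to_csv(sample, buffer)
print(buffer.getvalue())
```

To work with real files, the same function can be given an open text file for `lines` and a file opened with `newline=""` for `out_file`, as the `csv` module recommends.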


Result / Outcome


Number of Rules




Visualizations

1. 

The circle in the center represents all the rules with 100% confidence and 100% support.

The circles coming out of the main circle represent the antecedents of the rules.

The size of each circle represents its degree, i.e. how many rules it is part of.

Lastly, the circles originating from the antecedents are the consequents. Their sizes also represent their degree.
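The degree that drives the circle sizes can be computed directly from the rules. The sketch below is a minimal illustration, assuming each rule is available as an (antecedent, consequent) pair of strings in the format shown earlier. The function name `indicator_degrees` is hypothetical, and this is not the code used to draw the figure.

```python
import re
from collections import Counter

# Matches the indicator name at the start of a condition such as "I1 in [12.1; 115.2]".
ITEM_RE = re.compile(r"(I\d+)\s+in\s+\[")


def indicator_degrees(rules):
    """Count how many rules each indicator appears in, per side of the rule."""
    antecedent_degree = Counter()
    consequent_degree = Counter()
    for antecedent, consequent in rules:
        # Use a set so an indicator is counted once per rule and side.
        antecedent_degree.update(set(ITEM_RE.findall(antecedent)))
        consequent_degree.update(set(ITEM_RE.findall(consequent)))
    return antecedent_degree, consequent_degree


# The four example rules from the text file shown earlier.
rules = [
    ("I1 in [12.1; 115.2]", "I2 in [0.0; 958.0]"),
    ("I2 in [0.0; 941.0]", "I1 in [4.1; 177.8]"),
    ("I1 in [17.2; 110.4]", "I3 in [0.53; 0.96]"),
    ("I3 in [0.53; 0.95]", "I1 in [4.1; 177.8]"),
]
ante, cons = indicator_degrees(rules)
print(ante.most_common())
print(cons.most_common())
```

For these four rules, I1 appears as an antecedent in two rules and as a consequent in two rules, so it would get the largest circle on both levels.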

2. 
This visualization represents all the top rules (rules with the highest support and confidence).

Each arc represents a rule. The rules are sorted from high to low, left to right.

As is clear from the diagram, most rules have I1, I3, I5, I7, and I8 as antecedents.


Note

This project is similar to a research paper that I co-authored. You can check out the paper here.

You can check out the GitHub repo here.

