Using numerical association rule mining (NARM) with a genetic algorithm to find patterns of association among the indicators of a dataset
Objective
Find patterns of association among the indicators of the given dataset using numerical association rules generated by a genetic algorithm.
Data collection and preparation
For this project, we will use the United Nations SDG 5 dataset. The dataset has been cleaned and pre-processed. All of its parameters are renamed to I1, I2, I3, and so on, to make the data analysis easier.
EDA
1. Understanding the makeup of the data
2. Understanding the Correlation between different parameters
3. Box Plots
Numerical rule generation using a genetic algorithm
For this purpose, we will use QuantMiner.
After the rules are generated, I copy them to a text file. This is what the text file looks like:
1. support = 34 (91%) , confidence = 100 % : I1 in [12.1; 115.2] --> I2 in [0.0; 958.0]
2. support = 35 (94%) , confidence = 100 % : I2 in [0.0; 941.0] --> I1 in [4.1; 177.8]
3. support = 31 (83%) , confidence = 100 % : I1 in [17.2; 110.4] --> I3 in [0.53; 0.96]
4. support = 35 (94%) , confidence = 100 % : I3 in [0.53; 0.95] --> I1 in [4.1; 177.8]
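To make the numbers in these rules easier to read, here is a minimal sketch of how support and confidence could be computed for one interval rule. It assumes the standard definitions (support is the number of rows that satisfy both sides of the rule, confidence is that count divided by the number of rows that satisfy the antecedent). The function names and the toy rows are hypothetical and are not taken from the SDG 5 dataset or from QuantMiner.

```python
def in_interval(value, low, high):
    """Return True when value lies in the closed interval [low, high]."""
    return low <= value <= high


def rule_metrics(rows, antecedent, consequent):
    """Compute (support count, support %, confidence %) for one rule.

    antecedent and consequent are (column, low, high) tuples.
    """
    a_col, a_lo, a_hi = antecedent
    c_col, c_lo, c_hi = consequent
    n_antecedent = 0  # rows matching the antecedent
    n_both = 0        # rows matching antecedent and consequent
    for row in rows:
        if in_interval(row[a_col], a_lo, a_hi):
            n_antecedent += 1
            if in_interval(row[c_col], c_lo, c_hi):
                n_both += 1
    support_pct = 100 * n_both / len(rows)
    confidence_pct = 100 * n_both / n_antecedent if n_antecedent else 0.0
    return n_both, support_pct, confidence_pct


# Toy rows invented for illustration only.
toy = [
    {"I1": 20.0, "I2": 100.0},
    {"I1": 50.0, "I2": 900.0},
    {"I1": 110.0, "I2": 970.0},
    {"I1": 200.0, "I2": 10.0},
]

count, support_pct, confidence_pct = rule_metrics(
    toy, ("I1", 12.1, 115.2), ("I2", 0.0, 958.0)
)
```

With these definitions, a rule with 100% confidence means every row that falls in the antecedent interval also falls in the consequent interval.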
Next, I use Python code to convert the rules in the text file into a CSV file.
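A conversion along these lines could look like the sketch below. It is not the original script: the regular expression, the column names, and the helper functions are assumptions based only on the four sample lines shown above, and rules with several items on one side may need a different pattern.

```python
import csv
import io
import re

# Pattern inferred from the sample lines; the group names are hypothetical.
RULE_RE = re.compile(
    r"^\s*(?P<rule_id>\d+)\.\s*"
    r"support\s*=\s*(?P<support_count>\d+)\s*\((?P<support_pct>\d+)%\)\s*,\s*"
    r"confidence\s*=\s*(?P<confidence_pct>\d+)\s*%\s*:\s*"
    r"(?P<antecedent>.+?)\s*-->\s*(?P<consequent>.+?)\s*$"
)

FIELDS = [
    "rule_id", "support_count", "support_pct",
    "confidence_pct", "antecedent", "consequent",
]


def parse_rules(text):
    """Return one dict per line that matches the rule pattern."""
    rows = []
    for line in text.splitlines():
        match = RULE_RE.match(line)
        if match:  # lines that do not look like rules are skipped
            rows.append(match.groupdict())
    return rows


def rules_to_csv(rows):
    """Serialize parsed rules to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sample = """1. support = 34 (91%) , confidence = 100 % : I1 in [12.1; 115.2] --> I2 in [0.0; 958.0]
2. support = 35 (94%) , confidence = 100 % : I2 in [0.0; 941.0] --> I1 in [4.1; 177.8]"""

rows = parse_rules(sample)
csv_text = rules_to_csv(rows)
```

To work with files instead of strings, the text can be read with `open(path).read()` and `csv_text` written out the same way.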
Result / Outcome
Visualizations
1.
The circle in the center represents all the rules with 100% confidence and 100% support.
The circles coming out of the main circle represent the antecedents of the rules.
The size of each circle represents its degree, i.e. how many rules it is part of.
Lastly, the circles originating from the antecedents are the consequents. Their sizes also represent their degree.
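The degree used for the circle sizes can be counted directly from the rules. The sketch below is one possible way to do it, not the code behind the figure. It assumes indicator names follow the `I<number>` pattern and uses the four sample rules from the text file; `indicator_degrees` is a hypothetical helper.

```python
import re
from collections import Counter

INDICATOR_RE = re.compile(r"\bI\d+\b")


def indicator_degrees(rules):
    """Count how many rules each indicator appears in, per side.

    rules is a list of (antecedent, consequent) strings.
    Returns (antecedent_degree, consequent_degree) as Counters.
    """
    antecedent_degree = Counter()
    consequent_degree = Counter()
    for antecedent, consequent in rules:
        # set() so an indicator is counted once per rule and side
        for name in set(INDICATOR_RE.findall(antecedent)):
            antecedent_degree[name] += 1
        for name in set(INDICATOR_RE.findall(consequent)):
            consequent_degree[name] += 1
    return antecedent_degree, consequent_degree


rules = [
    ("I1 in [12.1; 115.2]", "I2 in [0.0; 958.0]"),
    ("I2 in [0.0; 941.0]", "I1 in [4.1; 177.8]"),
    ("I1 in [17.2; 110.4]", "I3 in [0.53; 0.96]"),
    ("I3 in [0.53; 0.95]", "I1 in [4.1; 177.8]"),
]

antecedent_degree, consequent_degree = indicator_degrees(rules)
```

In these four rules, I1 is the antecedent of two rules and the consequent of two others, so it would be drawn larger than I2 or I3 on both levels.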
2. This visualization shows the top rules (the rules with the highest support and confidence).
Each arc represents a rule. The rules are sorted from high to low, left to right.
As the diagram shows, most rules have I1, I3, I5, I7, and I8 as the antecedent.