This package is meant for implementing the Apriori algorithm as a microservice.
Install Composer
curl -sS https://getcomposer.org/installer | php
Add the package to your composer.json file
{
"require": {
"codedheartinside/apriori": "1.*"
}
}
Download the files
php composer.phar install
Add the autoloader for the files into your project
require 'vendor/autoload.php';
To set up the running environment for the package, run the installer
$installer = new \CodedHeartInside\DataMining\Apriori\Installer();
$installer->createRunningEnvironment();
You first need to create a configuration with the rules for the algorithm
$aprioriConfiguration = new \CodedHeartInside\DataMining\Apriori\Configuration();
// Configuring the boundaries is optional
$aprioriConfiguration->setDisplayDebugInformation();
$aprioriConfiguration->setMinimumThreshold(2) // Default is 2
->setMinimumSupport(0.2) // Default is 0.1
->setMinimumConfidence(0.5) // Default is 0.2
;
After that, everything is set to run the algorithm on a data set. Add the data set through the addDataSet function; each entry in the data set is one item set, given as an array of item IDs.
$dataSet = array(
array(1, 3, 4),
array(2, 4, 6),
array(1, 2),
array(5),
);
$dataInput = new \CodedHeartInside\DataMining\Apriori\Data\Input($aprioriConfiguration);
$dataInput->flushDataSet()
->addDataSet($dataSet)
->addDataSet($dataSet) // In this case, the data set is added twice to create more testing data
;
To run the algorithm on the data set, provide the Apriori class with the configuration and call the run function.
$aprioriClass = new \CodedHeartInside\DataMining\Apriori\Apriori($aprioriConfiguration);
$aprioriClass->run();
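For readers new to Apriori, the level-wise search behind the algorithm can be sketched in plain PHP. This is a simplified illustration under stated assumptions, not the package's implementation: frequentItemSets() is a hypothetical helper, support is computed as the plain share of item sets, and the usual subset-pruning step is left out because the support count already rejects those candidates.

```php
<?php
// Simplified level-wise Apriori sketch. frequentItemSets() is a hypothetical
// helper for illustration only; it is not part of the package.
function frequentItemSets(array $dataSet, float $minimumSupport): array
{
    $total = count($dataSet);
    if ($total === 0) {
        return array();
    }

    // Level 1 candidates: every distinct item as a one-element set, in sorted order.
    $items = array_unique(array_merge(...$dataSet));
    sort($items);
    $candidates = array_map(function ($item) {
        return array($item);
    }, $items);

    $frequent = array();
    while (count($candidates) > 0) {
        // Keep only the candidates that occur in enough item sets.
        $kept = array();
        foreach ($candidates as $candidate) {
            $matches = 0;
            foreach ($dataSet as $transaction) {
                // array_diff() is empty when the whole candidate is in the transaction.
                if (count(array_diff($candidate, $transaction)) === 0) {
                    $matches++;
                }
            }
            if ($matches / $total >= $minimumSupport) {
                $kept[] = $candidate;
                $frequent[implode(',', $candidate)] = $matches / $total;
            }
        }

        // Next level: join kept sets that share everything except their last item.
        $candidates = array();
        $keptCount = count($kept);
        for ($i = 0; $i < $keptCount; $i++) {
            for ($j = $i + 1; $j < $keptCount; $j++) {
                if (array_slice($kept[$i], 0, -1) === array_slice($kept[$j], 0, -1)) {
                    $candidates[] = array_merge($kept[$i], array(end($kept[$j])));
                }
            }
        }
    }

    return $frequent;
}

$dataSet = array(
    array(1, 3, 4),
    array(2, 4, 6),
    array(1, 2),
    array(5),
);

// Result is keyed by the comma-joined item IDs, with the support as value.
print_r(frequentItemSets($dataSet, 0.5));
```

With a minimum support of 0.5, only the single items 1, 2 and 4 qualify in this data set, because no pair of items occurs in more than one of the four item sets.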
After running the algorithm, the records with the statistics for support and confidence become retrievable.
Support is the fraction of the provided item sets in which an item combination occurs.
To get the records with the support statistics:
foreach ($aprioriClass->getSupportRecords() as $record) {
print_r($record);
// Outputs:
// Array
// (
// [itemIds] => Array
// (
// [0] => 1
// [1] => 4
// [2] => 6
// [3] => 7
// )
//
// [support] => 0.060606060606061
// )
}
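To make the support figure concrete, here is a minimal sketch that computes it by hand for the example data set. It assumes the textbook definition (matching item sets divided by all item sets); supportOf() is a hypothetical helper and not part of the package, so the values the package reports may be calculated differently.

```php
<?php
// Textbook support: the share of item sets that contain every item in $itemIds.
// supportOf() is a hypothetical helper for illustration, not part of the package.
function supportOf(array $itemIds, array $dataSet): float
{
    if (count($dataSet) === 0) {
        return 0.0;
    }

    $matches = 0;
    foreach ($dataSet as $transaction) {
        // array_diff() is empty when all of $itemIds appear in $transaction.
        if (count(array_diff($itemIds, $transaction)) === 0) {
            $matches++;
        }
    }

    return $matches / count($dataSet);
}

$dataSet = array(
    array(1, 3, 4),
    array(2, 4, 6),
    array(1, 2),
    array(5),
);

echo supportOf(array(4), $dataSet), PHP_EOL;    // item 4 is in 2 of 4 item sets: 0.5
echo supportOf(array(1, 2), $dataSet), PHP_EOL; // items 1 and 2 together are in 1 of 4: 0.25
```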
Confidence is the fraction of the item sets containing the "if" items that also contain the "then" item.
To get the records with the confidence statistics:
foreach ($aprioriClass->getConfidenceRecords() as $record) {
print_r($record);
// Outputs
// Array
// (
// [if] => Array
// (
// [0] => 1
// [1] => 7
// )
//
// [then] => 3
// [confidence] => 1
// )
}
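The confidence of a rule can be checked by hand in the same way. The sketch below assumes the textbook definition (item sets containing both the "if" items and the "then" item, divided by item sets containing the "if" items); countContaining() and confidenceOf() are hypothetical helpers and not part of the package.

```php
<?php
// Hypothetical helper: counts the item sets that contain every item in $itemIds.
function countContaining(array $itemIds, array $dataSet): int
{
    $matches = 0;
    foreach ($dataSet as $transaction) {
        if (count(array_diff($itemIds, $transaction)) === 0) {
            $matches++;
        }
    }

    return $matches;
}

// Hypothetical helper: textbook confidence of the rule "if $if then $then".
function confidenceOf(array $if, int $then, array $dataSet): float
{
    $ifCount = countContaining($if, $dataSet);
    if ($ifCount === 0) {
        return 0.0; // the "if" side never occurs, so the rule never applies
    }

    $bothCount = countContaining(array_merge($if, array($then)), $dataSet);

    return $bothCount / $ifCount;
}

$dataSet = array(
    array(1, 3, 4),
    array(2, 4, 6),
    array(1, 2),
    array(5),
);

echo confidenceOf(array(1), 2, $dataSet), PHP_EOL; // item 1 occurs twice, once with item 2: 0.5
echo confidenceOf(array(6), 4, $dataSet), PHP_EOL; // item 6 occurs once, always with item 4: 1
```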