Overview


Intelli ML makes machine learning and AI more accessible and more performant. Use this software platform to build custom models in the cloud. Enterprises can rapidly deploy instances that are optimized for performance. The Intelli ML platform includes the most popular machine learning frameworks and their dependencies, and it is built for easy and rapid deployment. Users can also learn machine learning on the platform itself, building their skills while using it to build ML solutions.


Beginner's Guide


Artificial intelligence will shape our future more powerfully than any other innovation this century. Anyone who does not understand it will soon find themselves feeling left behind, waking up in a world full of technology that feels more and more like magic.

Machine Learning (ML) is the use of algorithms and computational statistics to learn from data without being explicitly programmed. It is a subfield of artificial intelligence within computer science. Although the field did not take off until recently, the term was coined in 1959 and much of the foundational research was done during the 1970s and 1980s. Machine learning's rise to prominence today has been enabled by the abundance of data, more efficient data storage, and faster computers.

Terminology

Training Data : The data you use to train your model. It contains the information you have collected about the problem you want to solve.

Test Data : The data you use to test the model. The trained model makes predictions on this data so that you can evaluate it.

Features : The columns in your dataset that you use to train your model.

Class/Label : The column that holds the outcome for each record. This is the column you want to predict.

Accuracy : The percentage of records that the model predicts correctly when you apply it to your data.
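
To make these terms concrete, here is a minimal sketch in plain Python. The toy records, the column values, and the trivial majority-class "model" are illustrative assumptions and are not part of the platform.

```python
# Toy dataset: each record has two feature columns and one label column.
# All values here are made up for illustration.
records = [
    {"stamina": 7, "challenges": 3, "player_activity": "Active"},
    {"stamina": 2, "challenges": 1, "player_activity": "Churned"},
    {"stamina": 8, "challenges": 5, "player_activity": "Active"},
    {"stamina": 6, "challenges": 4, "player_activity": "Active"},
    {"stamina": 1, "challenges": 0, "player_activity": "Churned"},
]

features = ["stamina", "challenges"]  # columns a real model would learn from
label = "player_activity"             # column to predict


def split(rows, train_percentage):
    """Split rows into training data and test data by percentage."""
    cut = len(rows) * train_percentage // 100
    return rows[:cut], rows[cut:]


def accuracy(predicted, actual):
    """Percentage of records whose prediction matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100 * correct / len(actual)


train_data, test_data = split(records, 60)

# Stand-in "model": always predict the most common label in the training data.
# It ignores the feature columns, which a real algorithm would use.
train_labels = [row[label] for row in train_data]
majority = max(set(train_labels), key=train_labels.count)

predictions = [majority for _ in test_data]
actuals = [row[label] for row in test_data]
print(accuracy(predictions, actuals))
```

The split mirrors the idea behind trainPercentage in the APIs described later: part of the data trains the model and the rest evaluates it.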


Quickstart


The following services are provided by Phoenix ML as part of Release 1.2.0.

  • Classification : Classify entities into binary or multi classes.

  • Regression : Predict a value from a continuous range.

  • Feedback : Train an existing model with a new set of data.

  • Clustering : Cluster your dataset based on your business needs.


APIs


All URIs below are relative to https://studio.spotflock.com

API                                    Method  URI
Train a Classification Model           POST    /api/v1/ml-service/phoenix-ml/classification/train
Train a Regression Model               POST    /api/v1/ml-service/phoenix-ml/regression/train
Predicting from Classification Model   POST    /api/v1/ml-service/phoenix-ml/classification/predict
Predicting from Regression Model       POST    /api/v1/ml-service/phoenix-ml/regression/predict
Feedback from Regression Model         POST    /api/v1/ml-service/phoenix-ml/regression/feedback
Feedback from Classification Model     POST    /api/v1/ml-service/phoenix-ml/classification/feedback
Cluster Model                          POST    /api/v1/ml-service/phoenix-ml/cluster
Get Job Status                         GET     /api/v1/ml-service/phoenix-ml/job/status?id={id}
Get Job Output                         GET     /api/v1/ml-service/phoenix-ml/output/findBy?jobId={id}
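
As a convenience, the relative URIs above can be turned into full URLs with a small helper. This is only a sketch: the dictionary keys and the function name are hypothetical, and only the base URL and paths come from the list above.

```python
from urllib.parse import urlencode

BASE_URL = "https://studio.spotflock.com"
PREFIX = "/api/v1/ml-service/phoenix-ml"

# Paths copied from the endpoint list; the short key names are made up here.
ENDPOINTS = {
    "classification_train": "/classification/train",
    "regression_train": "/regression/train",
    "classification_predict": "/classification/predict",
    "regression_predict": "/regression/predict",
    "classification_feedback": "/classification/feedback",
    "regression_feedback": "/regression/feedback",
    "cluster": "/cluster",
    "job_status": "/job/status",
    "job_output": "/output/findBy",
}


def endpoint_url(name, **query):
    """Return the absolute URL for an endpoint, with optional query parameters."""
    url = BASE_URL + PREFIX + ENDPOINTS[name]
    if query:
        url += "?" + urlencode(query)
    return url


print(endpoint_url("classification_train"))
print(endpoint_url("job_status", id=969))
print(endpoint_url("job_output", jobId=969))
```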

Train a Classification Model

Description

This API trains a classification model. Training takes some time, so check the job status until the job completes. Once the job is complete, the job output API returns the model info.

URI

POST  /api/v1/ml-service/phoenix-ml/classification/train

Headers

api-key Your App's API Key

Attributes

library spotflock / weka
service classification
task train
config.name Name of the model
config.algorithm Name of the algorithm. See the list of supported algorithms under Release Notes.
config.datasetUrl Path of the training data after it has been uploaded to cloud storage.
config.label Column name to be predicted.
config.features List of column names for training.
config.trainPercentage Percentage of the data used for training. The rest is used for evaluating the model.
config.saveModel true / false
config.params Any configurations required for libraries.

Request Example

{
  "library": "weka",
  "service": "classification",
  "task": "train",
  "config": {
    "name": "Player Churn Model",
    "algorithm": "NaiveBayesBinomial",
    "datasetUrl": "/spotflock-studio/library/player_train.csv",
    "label": "player_activity",
    "trainPercentage": 80,
    "features": ["stamina","challenges","achievements"],
    "saveModel": "true",
    "params": {}
  }
}
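
The request body above can also be assembled programmatically. The following is a minimal sketch; the function name and the client-side checks are assumptions made for illustration, and the API itself may validate differently. The saveModel value is emitted as a string because that is how the example above shows it.

```python
import json


def build_train_request(library, service, name, algorithm, dataset_url,
                        label, features, train_percentage=80, save_model=True):
    """Assemble the JSON body for a train call as a plain dict."""
    # Hypothetical sanity checks, not documented API behaviour.
    if not 0 < train_percentage <= 100:
        raise ValueError("train_percentage must be in the range (0, 100]")
    if not features:
        raise ValueError("at least one feature column is required")
    if label in features:
        raise ValueError("the label column must not also be a feature")
    return {
        "library": library,
        "service": service,
        "task": "train",
        "config": {
            "name": name,
            "algorithm": algorithm,
            "datasetUrl": dataset_url,
            "label": label,
            "trainPercentage": train_percentage,
            "features": list(features),
            "saveModel": "true" if save_model else "false",
            "params": {},
        },
    }


request_body = build_train_request(
    library="weka",
    service="classification",
    name="Player Churn Model",
    algorithm="NaiveBayesBinomial",
    dataset_url="/spotflock-studio/library/player_train.csv",
    label="player_activity",
    features=["stamina", "challenges", "achievements"],
)
print(json.dumps(request_body, indent=2))
```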

Response

{
    "code": 200,
    "data": {
        "jobId": 969,
        "appId": 1558586024244,
        "name": "weka_classification_train",
        "library": "weka",
        "service": "Classification",
        "task": "TRAIN",
        "state": "RUN",
        "startTime": "2019-06-21T04:30:54.283+0000",
        "endTime": null,
        "request": {
            "library": "weka",
            "config": {
                "name": "Player Churn Model",
                "algorithm": "NaiveBayesBinomial",
                "datasetUrl": "/spotflock-studio/library/player_train.csv",
                "label": "player_activity",
                "trainPercentage": 80,
                "saveModel": "true",
                "params": {},
                "features":  ["stamina","challenges","achievements"]
            }
        },
        "isStreamJob": false,
        "isJobStopped": null
    }
}
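
A client usually only needs the job ID and the current state from this response. The sketch below shows one way to extract them; the function name is hypothetical, and the sample body is a trimmed copy of the response above.

```python
import json


def parse_submit_response(body):
    """Return (job_id, state) from the JSON text of a submit response."""
    payload = json.loads(body)
    data = payload.get("data")
    if not data or "jobId" not in data:
        raise ValueError("response does not describe a job")
    return data["jobId"], data.get("state")


# Trimmed copy of the sample response; unused fields are omitted.
sample = """
{
    "code": 200,
    "data": {
        "jobId": 969,
        "name": "weka_classification_train",
        "state": "RUN",
        "endTime": null
    }
}
"""

job_id, state = parse_submit_response(sample)
print(job_id, state)
```

The returned job ID is the value to pass to the Get Job Status and Get Job Output APIs.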

Predicting from a Classification Model

Description

This API gets predictions from a trained classification model. Once the job is complete, the job output API returns the file info from which you can retrieve the predictions.

URI

POST  /api/v1/ml-service/phoenix-ml/classification/predict

Headers

api-key Your App's API Key

Request Example

{
  "library": "weka",
  "service": "classification",
  "config": {
    "datasetUrl": "/spotflock-studio/library/player_test.csv",
    "modelUrl": "/spotflock-studio/1/1550423221357-NaiveBayesMultinomial.mdl",
    "params": {}
  }
}
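
The modelUrl in a predict request typically comes from the output of an earlier train job. The sketch below builds a predict body from such an output. It assumes the output carries the model path under output.modelUrl, as in the Get Job Output sample later in this document; the function name and the trimmed sample are illustrative.

```python
def predict_request_from_output(job_output, dataset_url, library, service):
    """Build a predict request body using the model saved by a train job."""
    model_url = job_output["output"]["modelUrl"]
    return {
        "library": library,
        "service": service,
        "config": {
            "datasetUrl": dataset_url,
            "modelUrl": model_url,
            "params": {},
        },
    }


# Trimmed stand-in for the output of a finished train job.
train_output = {
    "jobId": 20,
    "output": {
        "modelUrl": "/spotflock-studio/1/1550423221357-NaiveBayesMultinomial.mdl"
    },
}

predict_body = predict_request_from_output(
    train_output,
    dataset_url="/spotflock-studio/library/player_test.csv",
    library="weka",
    service="classification",
)
print(predict_body)
```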

Response

{
    "code": 200,
    "data": {
        "jobId": 970,
        "appId": 1560322200284,
        "name": "weka_classification_predict",
        "library": "weka",
        "service": "Classification",
        "task": "PREDICT",
        "state": "RUN",
        "startTime": "2019-06-21T04:33:30.418+0000",
        "endTime": null,
        "request": {
            "library": "weka",
            "config": {
                "modelUrl": "/spotflock-studio/1/1550423221357-NaiveBayesMultinomial.mdl",
                "datasetUrl": "/spotflock-studio/library/player_test.csv",
                "features":  ["stamina","challenges","achievements"]
            }
        },
        "isStreamJob": false,
        "isJobStopped": null
    }
}

Train an Existing Classification Model on a New Dataset (Feedback)

Description

This API retrains an already built classification model on a new dataset that has the same features and uses the same algorithm. Once the job is complete, the job output API returns the model info.

URI

POST  /api/v1/ml-service/phoenix-ml/classification/feedback

Headers

api-key Your App's API Key

Request Example

{
  "library": "weka",
  "service": "Classification",
  "task": "FEEDBACK",
  "config": {
    "name": "Player Churn Model",
    "algorithm": "NaiveBayesBinomial",
    "datasetUrl": "/spotflock-studio/library/player_feedback.csv",
    "modelUrl": "/spotflock-studio/library/1550423221357-NaiveBayesMultinomial.mdl",
    "feedbackDatasetUrl": "/spotflock-studio/library/player_feedback.csv",
    "features": ["stamina","challenges","achievements"],
    "trainPercentage": 80,
    "label": "player_activity",
    "saveModel": true,
    "params": {}
  }
}
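
Because a feedback request reuses the features, label, and algorithm of the original training run, one option is to derive it from the original train request. The sketch below does that; the function name is hypothetical, and whether the API treats the service and task values as case-sensitive is not stated in this document.

```python
import copy


def build_feedback_request(train_request, model_url, feedback_dataset_url):
    """Derive a feedback request body from the original train request."""
    body = copy.deepcopy(train_request)  # leave the caller's dict untouched
    body["task"] = "FEEDBACK"
    body["config"]["modelUrl"] = model_url
    body["config"]["feedbackDatasetUrl"] = feedback_dataset_url
    return body


original = {
    "library": "weka",
    "service": "classification",
    "task": "train",
    "config": {
        "name": "Player Churn Model",
        "algorithm": "NaiveBayesBinomial",
        "datasetUrl": "/spotflock-studio/library/player_train.csv",
        "label": "player_activity",
        "trainPercentage": 80,
        "features": ["stamina", "challenges", "achievements"],
        "saveModel": "true",
        "params": {},
    },
}

feedback = build_feedback_request(
    original,
    model_url="/spotflock-studio/library/1550423221357-NaiveBayesMultinomial.mdl",
    feedback_dataset_url="/spotflock-studio/library/player_feedback.csv",
)
print(feedback["task"], feedback["config"]["feedbackDatasetUrl"])
```

Copying the original request keeps the feature list identical, which matches the requirement that the feedback dataset contain the same features as the original dataset.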

Response

{
    "code": 0,
    "data": {
        "jobId": 971,
        "appId": 1560322200284,
        "name": "weka_classification_feedback",
        "library": "weka",
        "service": "Classification",
        "task": "FEEDBACK",
        "state": "RUN",
        "startTime": "2019-06-21T04:37:47.999+0000",
        "endTime": null,
        "request": {
            "library": "weka",
            "config": {
                "name": "Player Churn Model",
                "label": "player_activity",
                "params": {},
                "features": ["stamina","challenges","achievements"],
                "algorithm": "NaiveBayesBinomial",
                "saveModel": "true",
                "datasetUrl": "/spotflock-studio/library/player_feedback.csv",
                "trainPercentage": 80,
                "feedbackDatasetUrl": "/spotflock-studio/library/player_feedback.csv",
                "modelUrl": "/spotflock-studio/library/1550423221357-NaiveBayesMultinomial.mdl"
            }
        },
        "isStreamJob": false,
        "isJobStopped": null
    }
}

Train a Regression Model

Description

This API trains a regression model. Training takes some time, so check the job status until the job completes. Once the job is complete, the job output API returns the model info.

URI

POST  /api/v1/ml-service/phoenix-ml/regression/train

Headers

api-key Your App's API Key

Attributes

library spotflock / weka / scikit / h2o
service regression
task train
config.name Name of the model
config.algorithm Name of the algorithm. See the list of supported algorithms under Release Notes.
config.datasetUrl Path of the training data after it has been uploaded to cloud storage.
config.label Column name to be predicted.
config.features List of column names for training.
config.trainPercentage Percentage of the data used for training. The rest is used for evaluating the model.
config.saveModel true / false
config.params Any configurations required for libraries.

Request Example

{
  "library": "weka",
  "service": "regression",
  "task": "train",
  "config": {
    "name": "Housing Price Model",
    "algorithm": "LinearRegression",
    "datasetUrl": "/spotflock-studio/library/hp_train.csv",
    "label": "price",
    "trainPercentage": 80,
    "features": ["area","parking_area"],
    "saveModel": "true",
    "params": {}
  }
}
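
To send a body like the one above, the api-key header must accompany the POST. The sketch below prepares the request with Python's standard library without sending it. The Content-Type header, the placeholder key, and the function name are assumptions; only the base URL, the path, and the api-key header name come from this document.

```python
import json
import urllib.request

BASE_URL = "https://studio.spotflock.com"


def make_post(path, body, api_key):
    """Prepare (but do not send) a POST request with a JSON body."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "api-key": api_key,
            "Content-Type": "application/json",  # assumed, not stated in the docs
        },
        method="POST",
    )


body = {
    "library": "weka",
    "service": "regression",
    "task": "train",
    "config": {
        "name": "Housing Price Model",
        "algorithm": "LinearRegression",
        "datasetUrl": "/spotflock-studio/library/hp_train.csv",
        "label": "price",
        "trainPercentage": 80,
        "features": ["area", "parking_area"],
        "saveModel": "true",
        "params": {},
    },
}

req = make_post("/api/v1/ml-service/phoenix-ml/regression/train", body, "YOUR_API_KEY")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```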

Response

{
    "code": 200,
    "data": {
        "jobId": 971,
        "appId": 1558586024244,
        "name": "weka_regression_train",
        "library": "weka",
        "service": "Regression",
        "task": "TRAIN",
        "state": "RUN",
        "startTime": "2019-06-21T04:30:54.283+0000",
        "endTime": null,
        "request": {
            "library": "weka",
            "config": {
                "name": "Housing Price Model",
                "algorithm": "LinearRegression",
                "datasetUrl": "/spotflock-studio/library/hp_train.csv",
                "label": "price",
                "trainPercentage": 80,
                "saveModel": "true",
                "params": {},
                "features":  ["area","parking_area"]
            }
        },
        "isStreamJob": false,
        "isJobStopped": null
    }
}

Predicting from a Regression Model

Description

This API gets predictions from a trained regression model. Once the job is complete, the job output API returns the file info containing the predictions.

URI

POST  /api/v1/ml-service/phoenix-ml/regression/predict

Headers

api-key Your App's API Key

Request Example

{
  "library": "weka",
  "service": "regression",
  "config": {
    "datasetUrl": "/spotflock-studio/library/hp_test.csv",
    "modelUrl": "/spotflock-studio/1/1550423221357-LinearRegression.mdl",
    "params": {}
  }
}

Response

{
    "code": 0,
    "data": {
        "jobId": 972,
        "appId": 1560322200284,
        "name": "weka_regression_predict",
        "library": "weka",
        "service": "Regression",
        "task": "PREDICT",
        "state": "RUN",
        "startTime": "2019-06-21T04:33:30.418+0000",
        "endTime": null,
        "request": {
            "library": "weka",
            "config": {
                "modelUrl": "/spotflock-studio/1/1550423221357-LinearRegression.mdl",
                "datasetUrl": "/spotflock-studio/library/hp_test.csv",
                "features":  ["area","parking_area"]
            }
        },
        "isStreamJob": false,
        "isJobStopped": null
    }
}

Train an Existing Regression Model on a New Dataset (Feedback)

Description

This API retrains an already built regression model on a new dataset that has the same features and uses the same algorithm. Once the job is complete, the job output API returns the model info.

URI

POST  /api/v1/ml-service/phoenix-ml/regression/feedback

Headers

api-key Your App's API Key

Request Example

{
  "library": "weka",
  "service": "regression",
  "task": "FEEDBACK",
  "config": {
    "name": "Housing Price Model",
    "algorithm": "LinearRegression",
    "datasetUrl": "/spotflock-studio/library/hp_train.csv",
    "modelUrl": "/spotflock-studio/1/1550423221357-LinearRegression.mdl",
    "feedbackDatasetUrl": "/spotflock-studio/library/hp_feedback.csv",
    "features": ["area","parking_area"],
    "trainPercentage": 80,
    "label": "price",
    "saveModel": true,
    "params": {}
  }
}

Response

{
    "code": 0,
    "data": {
        "jobId": 974,
        "appId": 1560322200284,
        "name": "weka_regression_feedback",
        "library": "weka",
        "service": "Regression",
        "task": "FEEDBACK",
        "state": "RUN",
        "startTime": "2019-06-21T04:37:47.999+0000",
        "endTime": null,
        "request": {
            "library": "weka",
            "config": {
                "name": "Housing Price Model",
                "label": "price",
                "params": {},
                "features": ["area","parking_area"],
                "algorithm": "LinearRegression",
                "saveModel": "true",
                "datasetUrl": "/spotflock-studio/library/hp_train.csv",
                "trainPercentage": 80,
                "feedbackDatasetUrl": "/spotflock-studio/library/hp_feedback.csv",
                "modelUrl": "/spotflock-studio/1/1550423221357-LinearRegression.mdl"
            }
        },
        "isStreamJob": false,
        "isJobStopped": null
    }
}

Cluster Model

Description

This API clusters a dataset. Clustering takes some time, so check the job status until the job completes. Once the job is complete, the job output API returns the model info.

URI

POST  /api/v1/ml-service/phoenix-ml/cluster

Headers

api-key Your App's API Key

Attributes

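library Library to use, for example weka
service Clustering
task CLUSTER
config.name Name of the clustering job
config.algorithm Name of the clustering algorithm, for example KMeansClustering
config.datasetUrl Path of the dataset after it has been uploaded to cloud storage.
config.numOfClusters Number of clusters to form.
config.features List of column names used for clustering.
config.saveModel True / False
config.params Any configurations required for libraries.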

Request Example

{
  "library": "weka",
  "service": "Clustering",
  "task": "CLUSTER",
  "config": {
    "name": "Clustering",
    "algorithm": "KMeansClustering",
    "datasetUrl": "/spotflock-studio/library/moon_data.csv",
    "numOfClusters": 2,
    "saveModel": "True",
    "params": {},
    "features": ["X","Y"]
  }
}
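
A cluster request differs from a train request mainly in that it has no label and takes a cluster count. The sketch below assembles such a body; the function name, the default arguments, and the check on the cluster count are illustrative assumptions, and the field values follow the example above.

```python
def build_cluster_request(name, dataset_url, features, num_clusters,
                          algorithm="KMeansClustering", library="weka"):
    """Assemble the JSON body for a cluster call as a plain dict."""
    # Hypothetical sanity checks, not documented API behaviour.
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")
    if not features:
        raise ValueError("at least one feature column is required")
    return {
        "library": library,
        "service": "Clustering",
        "task": "CLUSTER",
        "config": {
            "name": name,
            "algorithm": algorithm,
            "datasetUrl": dataset_url,
            "numOfClusters": num_clusters,
            "saveModel": "True",  # string form, as in the example above
            "params": {},
            "features": list(features),
        },
    }


cluster_body = build_cluster_request(
    name="Clustering",
    dataset_url="/spotflock-studio/library/moon_data.csv",
    features=["X", "Y"],
    num_clusters=2,
)
print(cluster_body["config"]["numOfClusters"])
```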

Response

{
    "code": 200,
    "data": {
        "jobId": 968,
        "appId": 1558586024244,
        "name": "weka_clustering_cluster",
        "library": "weka",
        "service": "Clustering",
        "task": "CLUSTER",
        "state": "RUN",
        "startTime": "2019-06-21T04:28:12.116+0000",
        "endTime": null,
        "request": {
            "library": "weka",
            "config": {
                "name": "Clustering",
                "algorithm": "KMeansClustering",
                "datasetUrl": "/spotflock-studio/library/moon_data.csv",
                "numOfClusters": 2,
                "saveModel": "True",
                "params": {},
                "features": [
                    "X",
                    "Y"
                ]
            }
        },
        "isStreamJob": false,
        "isJobStopped": null
    }
}

Get Job Status

Description

Jobs such as train and predict take some time to complete. Use this API to check their status.

URI

GET  /api/v1/ml-service/phoenix-ml/job/status?id={id}

Headers

api-key Your App's API Key

Attributes

None. The job ID is passed in the id query parameter of the URI.

Response

{
    "id": 21,
    "name": "Player Churn Model",
    "library": "spotflock",
    "service": "Classification",
    "task": "PREDICT",
    "state": "FINISH",
    "startTime": "2019-02-17T18:25:19.587+0000",
    "endTime": "2019-02-17T18:25:24.583+0000",
    "msg": null,
    "request": {
        "library": "spotflock",
        "config": {
            "params": {},
            "modelUrl": "/spotflock-studio/1/1550427728251-NaiveBayesMultinomial_5044073238607802124mdl",
            "datasetUrl": "/spotflock-studio/library/rg_test.csv"
        }
    }
}
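
Since jobs run asynchronously, a client typically polls this API until the state leaves RUN. The sketch below shows one way to structure that loop. RUN and FINISH are the only states that appear in this document, so any state other than RUN is treated as final; the function names, the polling interval, and the fake fetcher are assumptions made so the example runs without network access.

```python
import time


def wait_for_job(fetch_status, job_id, interval=2.0, max_polls=30, sleep=time.sleep):
    """Poll fetch_status(job_id) until the job is no longer in the RUN state."""
    for _ in range(max_polls):
        status = fetch_status(job_id)
        if status["state"] != "RUN":
            return status
        sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")


# Fake fetcher standing in for a GET on the job status URI.
_states = iter(["RUN", "RUN", "FINISH"])


def fake_fetch(job_id):
    return {"id": job_id, "state": next(_states)}


final = wait_for_job(fake_fetch, 21, sleep=lambda seconds: None)
print(final["state"])
```

Passing the fetch function in as an argument keeps the loop independent of any particular HTTP client.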

Get Job Output

Description

Once the job state is FINISH, the job output can be retrieved from this API.

URI

GET  /api/v1/ml-service/phoenix-ml/output/findBy?jobId={id}

Headers

api-key Your App's API Key

Attributes

None. The job ID is passed in the jobId query parameter of the URI.

Response

{
    "id": 9,
    "jobId": 20,
    "state": null,
    "output": {
        "eval": {
            "kappa": -0.05913503971756384,
            "recall": {
                "Active": 0.5723684210526315,
                "Churned": 0.3541666666666667
            },
            "correct": 104,
            "accuracy": 52,
            "revision": "14755",
            "rocCurve": {
                "values": [
                    [
                        1,
                        1
                    ],
                    [
                        0.8958,
                        0.7368
                    ]
                ]
            },
            "errorRate": 0.48,
            "inCorrect": 96,
            "precision": {
                "Active": 0.7372881355932204,
                "Churned": 0.2073170731707317
            },
            "areaUnderPRC": {
                "Active": 0.7848942279681246,
                "Churned": 0.237283172269615
            },
            "areaUnderROC": {
                "Active": 0.49506578947368424,
                "Churned": 0.518297697368421
            },
            "priorEntropy": 0.7986194718732207,
            "confusionMatrix": [
                [
                    17,
                    31
                ],
                [
                    65,
                    87
                ]
            ],
            "numTrueNegatives": {
                "Active": 17,
                "Churned": 87
            },
            "numTruePositives": {
                "Active": 87,
                "Churned": 17
            },
            "trueNegativeRate": {
                "Active": 0.3541666666666667,
                "Churned": 0.5723684210526315
            },
            "truePositiveRate": {
                "Active": 0.5723684210526315,
                "Churned": 0.3541666666666667
            },
            "falseNegativeRate": {
                "Active": 0.4276315789473684,
                "Churned": 0.6458333333333334
            },
            "falsePositiveRate": {
                "Active": 0.6458333333333334,
                "Churned": 0.4276315789473684
            },
            "numFalseNegatives": {
                "Active": 65,
                "Churned": 31
            },
            "numFalsePositives": {
                "Active": 31,
                "Churned": 65
            },
            "pearsonCorrelation": {
                "challenges": 0.24937135217246517,
                "achievements": 0.18263960513415353,
                "stamina": 0.2493238592388467
            },
            "confusionMatrixHeaders": [
                "Churned",
                "Active"
            ],
            "correlationCoefficient": 0,
            "mathewsCorrelationCoefficient": {
                "Active": -0.06379320872133686,
                "Churned": -0.06379320872133686
            }
        },
        "modelUrl": "/spotflock-studio/1/1550427728251-NaiveBayesMultinomial_5044073238607802124mdl"
    }
}
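
Several of the evaluation fields above can be recomputed from confusionMatrix, which helps when interpreting the output. The sketch below assumes rows are actual classes and columns are predicted classes, both in confusionMatrixHeaders order; that reading is an assumption, chosen because it reproduces the recall and precision values in the sample. The function name is hypothetical.

```python
def metrics_from_confusion(matrix, headers):
    """Compute accuracy (percent), recall and precision per class."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    result = {"accuracy": 100 * correct / total, "recall": {}, "precision": {}}
    for i, name in enumerate(headers):
        actual_count = sum(matrix[i])                    # row total
        predicted_count = sum(row[i] for row in matrix)  # column total
        result["recall"][name] = matrix[i][i] / actual_count
        result["precision"][name] = matrix[i][i] / predicted_count
    return result


# Values copied from the sample output above.
confusion_matrix = [[17, 31], [65, 87]]
confusion_headers = ["Churned", "Active"]

metrics = metrics_from_confusion(confusion_matrix, confusion_headers)
print(metrics["accuracy"])
print(metrics["recall"])
print(metrics["precision"])
```

For example, 17 + 87 = 104 of the 200 records lie on the diagonal, which matches the correct and accuracy fields in the sample.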

Data Usage FAQ


Apart from the pricing plans for Phoenix ML, where you pay per API call, you can also use SCUs to consume Phoenix ML APIs. The table below lists the SCUs consumed by each API.

APIs                                            SCUs Consumed
Train/Predict/Feedback a Classification Model   10
Train/Predict/Feedback a Regression Model       10
Cluster Model                                   7
Get Job Status                                  0
Get Job Output                                  0
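
As a worked example, the SCU cost of a typical workflow can be estimated from the table. The sketch below reads each figure as a per-call cost, which is an assumption, and the short call names are made up for illustration.

```python
# SCUs per call, read from the table above (same cost for classification and regression).
SCU_COST = {
    "train": 10,
    "predict": 10,
    "feedback": 10,
    "cluster": 7,
    "job_status": 0,
    "job_output": 0,
}


def total_scus(calls):
    """Sum the SCUs consumed by a sequence of API calls."""
    return sum(SCU_COST[call] for call in calls)


# Train a model, poll twice, read the output, then predict and read that output.
workflow = [
    "train", "job_status", "job_status", "job_output",
    "predict", "job_status", "job_output",
]
print(total_scus(workflow))
```

Under this reading, polling the job status does not add to the cost, so only the train and predict calls are charged.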

Release Notes


The following are the release notes for Release 2.0.0.

  • Algorithms supported under classification are Logistic, MultilayerPerceptron, NaiveBayesMultinomial, RandomForest, LibSVM, AdaBoostM1, AttributeSelectedClassifier, Bagging, CostSensitiveClassifier, DecisionTable, GaussianProcesses, IBk, RandomTree, SMO.

  • Algorithms supported under Regression are LinearRegression, AdditiveRegression.

  • The feedback dataset must contain the same features as the original dataset.

  • Max train file upload size is 100 MB.

  • Max test file upload size is 50 MB.

  • No more than 20 features can be selected for training.
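
These limits can be checked on the client before a train request is submitted. The sketch below is one possible pre-flight check; the function name is hypothetical, the algorithm lists are copied from the notes above, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
CLASSIFICATION_ALGORITHMS = {
    "Logistic", "MultilayerPerceptron", "NaiveBayesMultinomial", "RandomForest",
    "LibSVM", "AdaBoostM1", "AttributeSelectedClassifier", "Bagging",
    "CostSensitiveClassifier", "DecisionTable", "GaussianProcesses", "IBk",
    "RandomTree", "SMO",
}
REGRESSION_ALGORITHMS = {"LinearRegression", "AdditiveRegression"}
MAX_FEATURES = 20
MAX_TRAIN_BYTES = 100 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def check_train_request(service, algorithm, features, train_file_bytes):
    """Return a list of problems; an empty list means no limit is violated."""
    problems = []
    supported = {
        "classification": CLASSIFICATION_ALGORITHMS,
        "regression": REGRESSION_ALGORITHMS,
    }.get(service.lower())
    if supported is None:
        problems.append(f"unknown service: {service}")
    elif algorithm not in supported:
        problems.append(f"{algorithm} is not listed for {service}")
    if len(features) > MAX_FEATURES:
        problems.append(f"{len(features)} features selected, maximum is {MAX_FEATURES}")
    if train_file_bytes > MAX_TRAIN_BYTES:
        problems.append("train file is larger than 100 MB")
    return problems


print(check_train_request("regression", "LinearRegression", ["area", "parking_area"], 5_000_000))
print(check_train_request("classification", "RandomForest", [f"f{i}" for i in range(21)], 5_000_000))
```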

Helpdesk