Overview
Intelli ML makes machine learning and AI more accessible and more performant. Use this software platform to build custom models in the cloud. Enterprises can rapidly deploy performance-optimized instances. The Intelli ML platform includes the most popular machine learning frameworks and their dependencies, and it is built for easy and rapid deployment. Users can also learn machine learning within the platform itself, building professional skills while using it to develop ML solutions.
Beginner's Guide
Artificial intelligence will shape our future more powerfully than any other innovation this century. Anyone who does not understand it will soon find themselves feeling left behind, waking up in a world full of technology that feels more and more like magic.
Machine learning (ML) is the use of algorithms and computational statistics to learn from data without being explicitly programmed. It is a subfield of artificial intelligence within computer science. Although the field did not explode until recently, the term was coined in 1959 and the most foundational research was done throughout the 1970s and 1980s. Machine learning's rise to prominence today has been enabled by the abundance of data, more efficient data storage, and faster computers.
Terminologies
Training Data : The data you use for training your model. It contains all the information you have collected about the problem statement.
Test Data : The data you use for testing the model. You can make predictions on this data.
Features : These are all the columns in your dataset which you use for training your model.
Class/Label : The column you want the model to predict, such as the category each record belongs to.
Accuracy : The percentage of records predicted correctly when you apply the model to your data.
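To make these terms concrete, here is a minimal Python sketch on a made-up five-row dataset. The column names are borrowed from the player churn example used later in this document; the row values are invented, and the hand-written threshold rule only stands in for a trained model.

```python
# Toy dataset: every value below is invented for illustration.
rows = [
    {"stamina": 80, "challenges": 12, "player_activity": "Active"},
    {"stamina": 15, "challenges": 1, "player_activity": "Churned"},
    {"stamina": 60, "challenges": 7, "player_activity": "Active"},
    {"stamina": 30, "challenges": 9, "player_activity": "Churned"},
    {"stamina": 55, "challenges": 2, "player_activity": "Churned"},
]

features = ["stamina", "challenges"]  # columns the model may look at
label = "player_activity"             # column the model should predict

# Use the first 60% of the rows as training data and the rest as test data.
train_percentage = 60
split = len(rows) * train_percentage // 100
training_data, test_data = rows[:split], rows[split:]


def predict(row):
    """Stand-in for a trained model: a hand-written threshold on one feature."""
    inputs = {name: row[name] for name in features}
    return "Active" if inputs["stamina"] >= 50 else "Churned"


def accuracy(data):
    """Percentage of rows whose predicted label equals the actual label."""
    correct = sum(1 for row in data if predict(row) == row[label])
    return 100.0 * correct / len(data)


print(len(training_data), len(test_data))  # 3 2
print(accuracy(test_data))                 # 50.0
```

The rule gets one of the two test rows wrong, so the accuracy on the test data is 50%.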
Quickstart
The following services are provided by Phoenix ML as part of Release 1.2.0.
Classification : Classify entities into binary or multi classes.
Regression : Predict a value from a continuous range.
Feedback : Retrain an existing model with a new set of data.
Clustering : Cluster your dataset based on your business needs.
APIs
All URIs below are relative to https://studio.spotflock.com
Train a Classification Model | POST /api/v1/ml-service/phoenix-ml/classification/train |
Train a Regression Model | POST /api/v1/ml-service/phoenix-ml/regression/train |
Predicting from Classification Model | POST /api/v1/ml-service/phoenix-ml/classification/predict |
Predicting from Regression Model | POST /api/v1/ml-service/phoenix-ml/regression/predict |
Feedback from Regression Model | POST /api/v1/ml-service/phoenix-ml/regression/feedback |
Feedback from Classification Model | POST /api/v1/ml-service/phoenix-ml/classification/feedback |
Cluster Model | POST /api/v1/ml-service/phoenix-ml/cluster |
Get Job Status | GET /api/v1/ml-service/phoenix-ml/job/status?id={id} |
Get Job Output | GET /api/v1/ml-service/phoenix-ml/output/findBy?jobId={id} |
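The table above can be captured as a small lookup in client code. The following Python sketch is one possible way to do so; the dictionary keys and the `endpoint_url` helper are hypothetical names, while the methods, paths, and query parameter names are copied from the table.

```python
from urllib.parse import urlencode

BASE_URL = "https://studio.spotflock.com"
PREFIX = "/api/v1/ml-service/phoenix-ml"

# (HTTP method, path) pairs from the table; the keys are hypothetical names.
ENDPOINTS = {
    "classification_train": ("POST", "/classification/train"),
    "regression_train": ("POST", "/regression/train"),
    "classification_predict": ("POST", "/classification/predict"),
    "regression_predict": ("POST", "/regression/predict"),
    "regression_feedback": ("POST", "/regression/feedback"),
    "classification_feedback": ("POST", "/classification/feedback"),
    "cluster": ("POST", "/cluster"),
    "job_status": ("GET", "/job/status"),
    "job_output": ("GET", "/output/findBy"),
}


def endpoint_url(name, **query):
    """Return (method, absolute URL) for an endpoint, appending query parameters."""
    method, path = ENDPOINTS[name]
    url = BASE_URL + PREFIX + path
    if query:
        url += "?" + urlencode(query)
    return method, url


print(endpoint_url("job_status", id=21))
print(endpoint_url("job_output", jobId=20))
```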
Train a Classification Model
Description
This API would enable you to train a classification model. The model takes some time to be trained and thus the job status has to be checked. Once the job is completed, the job output API would give you the model info.
URI
POST
/api/v1/ml-service/phoenix-ml/classification/train
Headers
api-key | Your App's API Key |
Attributes
library | intellihub |
service | classification |
task | train |
config.name | Name of the model |
config.algorithm | Name of the algorithm. See the list of algorithms below. |
config.datasetUrl | Path of the train data after uploading to cloud storage. See here for more Info. |
config.label | Column name to be predicted. |
config.features | List of column names for training. |
config.trainPercentage | Percentage of data used for training. The rest is used for evaluating the model. |
config.saveModel | true / false |
config.params | Any configurations required for libraries. |
Request Example
{
"library": "weka",
"service": "classification",
"task": "train",
"config": {
"name": "Player Churn Model",
"algorithm": "NaiveBayesBinomial",
"datasetUrl": "/spotflock-studio/library/player_train.csv",
"label": "player_activity",
"trainPercentage": 80,
"features": ["stamina","challenges","achievements"],
"saveModel": "true",
"params": {}
}
}
Response
{
"code": 200,
"data": {
"jobId": 969,
"appId": 1558586024244,
"name": "weka_classification_train",
"library": "weka",
"service": "Classification",
"task": "TRAIN",
"state": "RUN",
"startTime": "2019-06-21T04:30:54.283+0000",
"endTime": null,
"request": {
"library": "weka",
"config": {
"name": "Player Churn Model",
"algorithm": "NaiveBayesBinomial",
"datasetUrl": "/spotflock-studio/library/player_train.csv",
"label": "player_activity.Grid",
"trainPercentage": 80,
"saveModel": "true",
"params": {},
"features": ["stamina","challenges","achievements"]
}
},
"isStreamJob": false,
"isJobStopped": null
}
}
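As a rough illustration, the request above could be assembled in Python with only the standard library. This is a sketch under assumptions, not an official client: the `build_train_request` helper and the `YOUR_API_KEY` placeholder are hypothetical, and sending a JSON `Content-Type` header is assumed because only the `api-key` header is documented. The request is built but not sent.

```python
import json
import urllib.request

TRAIN_URL = (
    "https://studio.spotflock.com"
    "/api/v1/ml-service/phoenix-ml/classification/train"
)


def build_train_request(api_key, config, library="weka"):
    """Build (but do not send) the POST request for a classification training job."""
    payload = {
        "library": library,
        "service": "classification",
        "task": "train",
        "config": config,
    }
    return urllib.request.Request(
        TRAIN_URL,
        data=json.dumps(payload).encode("utf-8"),
        # The Content-Type header is an assumption; only api-key is documented.
        headers={"api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


config = {
    "name": "Player Churn Model",
    "algorithm": "NaiveBayesMultinomial",  # from the supported list in the release notes
    "datasetUrl": "/spotflock-studio/library/player_train.csv",
    "label": "player_activity",
    "trainPercentage": 80,
    "features": ["stamina", "challenges", "achievements"],
    "saveModel": "true",
    "params": {},
}
request = build_train_request("YOUR_API_KEY", config)  # placeholder key

# Submitting would look like this (not executed here):
# with urllib.request.urlopen(request) as response:
#     job_id = json.load(response)["data"]["jobId"]
```

The `jobId` read from the response is what the job status and job output APIs expect as their identifier.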
Predicting from a Classification Model
Description
This API would enable you to get predictions from a classification model. Once the job is completed, the job output API would give you the file info from which you can get the predictions.
URI
POST
/api/v1/ml-service/phoenix-ml/classification/predict
Headers
api-key | Your App's API Key |
Request Example
{
"library": "weka",
"service": "classification",
"config": {
"datasetUrl": "/spotflock-studio/library/player_test.csv",
"modelUrl":"/spotflock-studio/1/1550423221357-NaiveBayesMultinomial.mdl",
"params":{
}
}
}
Response
{
"code": 200,
"data": {
"jobId": 970,
"appId": 1560322200284,
"name": "weka_classification_predict",
"library": "weka",
"service": "Regression",
"task": "PREDICT",
"state": "RUN",
"startTime": "2019-06-21T04:33:30.418+0000",
"endTime": null,
"request": {
"library": "weka",
"config": {
"modelUrl": "/spotflock-studio/1/1550423221357-NaiveBayesMultinomial.mdl",
"datasetUrl": "/spotflock-studio/library/player_test.csv",
"features": ["stamina","challenges","achievements"]
}
},
"isStreamJob": false,
"isJobStopped": null
}
}
Feedback: Retrain an Existing Classification Model on a New Dataset
Description
This API would enable you to retrain an already built classification model on a new dataset that has the same features and uses the same algorithm. Once the job is completed, the job output API would give you the model info.
URI
POST
/api/v1/ml-service/phoenix-ml/classification/feedback
Headers
api-key | Your App's API Key |
Request Example
{
"library":"weka",
"service":"Classification",
"task":"FEEDBACK",
"config":{
"name":"Player Churn Model",
"algorithm":"NaiveBayesBinomial",
"datasetUrl": "/spotflock-studio/library/player_feedback.csv",
"modelUrl": "/spotflock-studio/library/1550423221357-NaiveBayesMultinomial.mdl",
"feedbackDatasetUrl":"/spotflock-studio/library/player_feedback.csv",
"features":["stamina","challenges","achievements"],
"trainPercentage": 80,
"label": "player_activity",
"saveModel":true,
"params":{}
}
}
Response
{
"code": 0,
"data": {
"jobId": 971,
"appId": 1560322200284,
"name": "weka_classification_feedback",
"library": "h2o",
"service": "Classification",
"task": "FEEDBACK",
"state": "RUN",
"startTime": "2019-06-21T04:37:47.999+0000",
"endTime": null,
"request": {
"library": "weka",
"config": {
"name": "Player Churn Model",
"label": "player_activity",
"params": {},
"features": ["stamina","challenges","achievements"],
"algorithm": "NaiveBayesBinomial",
"saveModel": "true",
"datasetUrl": "/spotflock-studio/library/player_feedback.csv",
"trainPercentage": 80,
"feedbackDatasetUrl": "/spotflock-studio/library/player_feedback.csv",
"modelUrl": "/spotflock-studio/library/1550423221357-NaiveBayesMultinomial.mdl"
}
},
"isStreamJob": false,
"isJobStopped": null
}
}
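Comparing the feedback request with the train request, the feedback body appears to be the train body with `task` set to `FEEDBACK` and two extra config fields, `modelUrl` and `feedbackDatasetUrl`. The Python sketch below derives one from the other under that assumption; the `to_feedback_payload` helper is a hypothetical name, and `datasetUrl` is left unchanged because this document does not say which dataset it should point to.

```python
import copy


def to_feedback_payload(train_payload, model_url, feedback_dataset_url):
    """Return a feedback request body derived from a train request body."""
    payload = copy.deepcopy(train_payload)  # leave the caller's dict untouched
    payload["task"] = "FEEDBACK"
    payload["config"]["modelUrl"] = model_url
    payload["config"]["feedbackDatasetUrl"] = feedback_dataset_url
    return payload


train_payload = {
    "library": "weka",
    "service": "Classification",
    "task": "train",
    "config": {
        "name": "Player Churn Model",
        "algorithm": "NaiveBayesMultinomial",
        "datasetUrl": "/spotflock-studio/library/player_train.csv",
        "label": "player_activity",
        "trainPercentage": 80,
        "features": ["stamina", "challenges", "achievements"],
        "saveModel": "true",
        "params": {},
    },
}

feedback_payload = to_feedback_payload(
    train_payload,
    model_url="/spotflock-studio/library/1550423221357-NaiveBayesMultinomial.mdl",
    feedback_dataset_url="/spotflock-studio/library/player_feedback.csv",
)
print(feedback_payload["task"])
```

Copying the features and algorithm from the original request keeps them identical, which is what the feedback API requires.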
Train a Regression Model
Description
This API would enable you to train a regression model. The model takes some time to be trained and thus the job status has to be checked. Once the job is completed, the job output API would give you the model info.
URI
POST
/api/v1/ml-service/phoenix-ml/regression/train
Headers
api-key | Your App's API Key |
Attributes
library | spotflock / weka/ scikit/ h2o |
service | regression |
task | train |
config.name | Name of the model |
config.algorithm | Name of the algorithm. See the list of algorithms below. |
config.datasetUrl | Path of the train data after uploading to cloud storage. See here for more Info. |
config.label | Column name to be predicted. |
config.features | List of column names for training. |
config.trainPercentage | Percentage of data used for training. The rest is used for evaluating the model. |
config.saveModel | true / false |
config.params | Any configurations required for libraries. |
Request Example
{
"library": "spotflock",
"service": "regression",
"task": "train",
"config": {
"name": "Housing Price Model",
"algorithm": "LinearRegression",
"datasetUrl": "/spotflock-studio/library/hp_train.csv",
"label": "price",
"trainPercentage": 80,
"features": ["area","parking_area"],
"saveModel": "true",
"params": {}
}
}
Response
{
"code": 200,
"data": {
"jobId": 971,
"appId": 1558586024244,
"name": "weka_regression_train",
"library": "weka",
"service": "Regression",
"task": "TRAIN",
"state": "RUN",
"startTime": "2019-06-21T04:30:54.283+0000",
"endTime": null,
"request": {
"library": "weka",
"config": {
"name": "Housing Price Model",
"algorithm": "LinearRegression",
"datasetUrl": "/spotflock-studio/library/hp_train.csv",
"label": "price",
"trainPercentage": 80,
"saveModel": "true",
"params": {},
"features": ["area","parking_area"]
}
},
"isStreamJob": false,
"isJobStopped": null
}
}
Predicting from a Regression Model
Description
This API would enable you to get predictions from a regression model. Once the job is completed, the job output API would give you the file info containing the predictions.
URI
POST
/api/v1/ml-service/phoenix-ml/regression/predict
Headers
api-key | Your App's API Key |
Request Example
{
"library": "weka",
"service": "regression",
"config": {
"datasetUrl": "/spotflock-studio/library/hp_test.csv",
"modelUrl":"/spotflock-studio/1/1550423221357-LinearRegression.mdl",
"params":{
}
}
}
Response
{
"code": 0,
"data": {
"jobId": 972,
"appId": 1560322200284,
"name": "weka_regression_predict",
"library": "weka",
"service": "Regression",
"task": "PREDICT",
"state": "RUN",
"startTime": "2019-06-21T04:33:30.418+0000",
"endTime": null,
"request": {
"library": "scikit",
"config": {
"modelUrl": "/spotflock-studio/1/1550423221357-LinearRegression.mdl",
"datasetUrl": "/spotflock-studio/library/hp_test.csv",
"features": ["area","parking_area"]
}
},
"isStreamJob": false,
"isJobStopped": null
}
}
Feedback: Retrain an Existing Regression Model on a New Dataset
Description
This API would enable you to retrain an already built regression model on a new dataset that has the same features and uses the same algorithm. Once the job is completed, the job output API would give you the model info.
URI
POST
/api/v1/ml-service/phoenix-ml/regression/feedback
Headers
api-key | Your App's API Key |
Request Example
{
"library":"weka",
"service":"regression",
"task":"FEEDBACK",
"config":{
"name":"Housing Price Model",
"algorithm":"LinearRegression",
"datasetUrl": "/spotflock-studio/library/hp_train.csv",
"modelUrl": "/spotflock-studio/1/1550423221357-LinearRegression.mdl",
"feedbackDatasetUrl":"/spotflock-studio/library/hp_feedback.csv",
"features":["area","parking_area"],
"trainPercentage": 80,
"label": "price",
"saveModel":true,
"params":{}
}
}
Response
{
"code": 0,
"data": {
"jobId": 974,
"appId": 1560322200284,
"name": "weka_regression_feedback",
"library": "weka",
"service": "Regression",
"task": "FEEDBACK",
"state": "RUN",
"startTime": "2019-06-21T04:37:47.999+0000",
"endTime": null,
"request": {
"library": "h2o",
"config": {
"name": "Housing Price Model",
"label": "price",
"params": {},
"features": ["area","parking_area"],
"algorithm": "LinearRegression",
"saveModel": "true",
"datasetUrl": "/spotflock-studio/library/hp_train.csv",
"trainPercentage": 80,
"feedbackDatasetUrl": "/spotflock-studio/library/hp_feedback.csv",
"modelUrl": "/spotflock-studio/1/1550423221357-LinearRegression.mdl"
}
},
"isStreamJob": false,
"isJobStopped": null
}
}
Cluster Model
Description
This API would enable you to cluster a dataset. Clustering would take some time to be completed and thus the job status has to be checked. Once the job is completed, the job output API would give you the model info.
URI
POST
/api/v1/ml-service/phoenix-ml/cluster
Headers
api-key | Your App's API Key |
Attributes
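library | weka |
service | Clustering |
task | CLUSTER |
config.name | Name of the model |
config.algorithm | Name of the clustering algorithm, for example KMeansClustering. |
config.datasetUrl | Path of the dataset after uploading to cloud storage. |
config.numOfClusters | Number of clusters to form. |
config.features | List of column names used for clustering. |
config.saveModel | true / false |
config.params | Any configurations required for libraries. |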
Request Example
{
"library":"weka",
"service":"Clustering",
"task":"CLUSTER",
"config":{
"name":"Clustering",
"algorithm":"KMeansClustering",
"datasetUrl":"/spotflock-studio/library/moon_data.csv",
"numOfClusters": 2,
"saveModel": "True",
"params":{},
"features":["X","Y"]
}
}
Response
{
"code": 200,
"data": {
"jobId": 968,
"appId": 1558586024244,
"name": "weka_clustering_cluster",
"library": "weka",
"service": "Clustering",
"task": "CLUSTER",
"state": "RUN",
"startTime": "2019-06-21T04:28:12.116+0000",
"endTime": null,
"request": {
"library": "weka",
"config": {
"name": "Clustering",
"algorithm": "KMeansClustering",
"datasetUrl": "/spotflock-studio/library/moon_data.csv",
"numOfClusters": 2,
"saveModel": "True",
"params": {},
"features": [
"X",
"Y"
]
}
},
"isStreamJob": false,
"isJobStopped": null
}
}
Get Job Status
Description
The train/predict jobs take some amount of time to be completed and so their status can be checked with this API.
URI
GET
/api/v1/ml-service/phoenix-ml/job/status?id={id}
Headers
api-key | Your App's API Key |
Attributes
None | None |
Response
{
"id": 21,
"name": "Player Churn Model",
"library": "spotflock",
"service": "Classification",
"task": "PREDICT",
"state": "FINISH",
"startTime": "2019-02-17T18:25:19.587+0000",
"endTime": "2019-02-17T18:25:24.583+0000",
"msg": null,
"request": {
"library": "spotflock",
"config": {
"params": {},
"modelUrl": "/spotflock-studio/1/1550427728251-NaiveBayesMultinomial_5044073238607802124mdl",
"datasetUrl": "/spotflock-studio/library/rg_test.csv"
}
}
}
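Because jobs run asynchronously, a client typically polls this endpoint until the job leaves the `RUN` state. The Python sketch below shows one way to structure that loop. The `wait_for_job` helper and its parameters are hypothetical, and treating every state other than `RUN` as final is an assumption, since only `RUN` and `FINISH` appear in this document. The status fetcher is passed in as a function, so the loop can be demonstrated here with canned responses instead of real HTTP calls.

```python
import time


def wait_for_job(fetch_status, job_id, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll fetch_status(job_id) until the job is no longer in the RUN state.

    fetch_status must return the parsed JSON body of the Get Job Status API.
    Raises TimeoutError once roughly `timeout` seconds have been spent waiting.
    """
    waited = 0.0
    while True:
        status = fetch_status(job_id)
        if status["state"] != "RUN":  # assumption: any other state is final
            return status
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} still running after {waited} seconds")
        sleep(interval)
        waited += interval


# Demonstration with canned responses and a no-op sleep instead of real requests.
responses = iter([{"state": "RUN"}, {"state": "RUN"}, {"state": "FINISH"}])
final = wait_for_job(lambda job_id: next(responses), job_id=21, sleep=lambda seconds: None)
print(final["state"])  # FINISH
```

In real use, `fetch_status` would issue the GET request shown above with the `api-key` header and return the decoded JSON.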
Get Job Output
Description
Once the job state is FINISH, the job output can be retrieved from this API.
URI
GET
/api/v1/ml-service/phoenix-ml/output/findBy?jobId={id}
Headers
api-key | Your App's API Key |
Attributes
None | None |
Response
{
"id": 9,
"jobId": 20,
"state": null,
"output": {
"eval": {
"kappa": -0.05913503971756384,
"recall": {
"Active": 0.5723684210526315,
"Churned": 0.3541666666666667
},
"correct": 104,
"accuracy": 52,
"revision": "14755",
"rocCurve": {
"values": [
[
1,
1
],
[
0.8958,
0.7368
]
]
},
"errorRate": 0.48,
"inCorrect": 96,
"precision": {
"Active": 0.7372881355932204,
"Churned": 0.2073170731707317
},
"areaUnderPRC": {
"Active": 0.7848942279681246,
"Churned": 0.237283172269615
},
"areaUnderROC": {
"Active": 0.49506578947368424,
"Churned": 0.518297697368421
},
"priorEntropy": 0.7986194718732207,
"confusionMatrix": [
[
17,
31
],
[
65,
87
]
],
"numTrueNegatives": {
"Active": 17,
"Churned": 87
},
"numTruePositives": {
"Active": 87,
"Churned": 17
},
"trueNegativeRate": {
"Active": 0.3541666666666667,
"Churned": 0.5723684210526315
},
"truePositiveRate": {
"Active": 0.5723684210526315,
"Churned": 0.3541666666666667
},
"falseNegativeRate": {
"Active": 0.4276315789473684,
"Churned": 0.6458333333333334
},
"falsePositiveRate": {
"Active": 0.6458333333333334,
"Churned": 0.4276315789473684
},
"numFalseNegatives": {
"Active": 65,
"Churned": 31
},
"numFalsePositives": {
"Active": 31,
"Churned": 65
},
"pearsonCorrelation": {
"challenges": 0.24937135217246517,
"achievements": 0.18263960513415353,
"stamina": 0.2493238592388467
},
"confusionMatrixHeaders": [
"Churned",
"Active"
],
"correlationCoefficient": 0,
"mathewsCorrelationCoefficient": {
"Active": -0.06379320872133686,
"Churned": -0.06379320872133686
}
},
"modelUrl": "/spotflock-studio/1/1550427728251-NaiveBayesMultinomial_5044073238607802124mdl"
}
}
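Several of the evaluation fields in this sample can be recomputed from `confusionMatrix` and `confusionMatrixHeaders`. The Python sketch below does so, assuming that rows are actual classes and columns are predicted classes, in header order. That orientation is not stated in this document, but it reproduces the `recall`, `precision`, `correct`, and `accuracy` values in the sample.

```python
headers = ["Churned", "Active"]
matrix = [
    [17, 31],  # actual Churned: 17 predicted Churned, 31 predicted Active
    [65, 87],  # actual Active: 65 predicted Churned, 87 predicted Active
]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[i][i] for i in range(len(headers)))
accuracy = 100 * correct / total     # percentage, as in the "accuracy" field
error_rate = 1 - correct / total     # fraction, as in the "errorRate" field


def recall(i):
    """Correct predictions for class i divided by all actual members of class i."""
    return matrix[i][i] / sum(matrix[i])


def precision(i):
    """Correct predictions for class i divided by all predictions of class i."""
    return matrix[i][i] / sum(row[i] for row in matrix)


print(correct, total, accuracy)
for i, name in enumerate(headers):
    print(name, "recall", recall(i), "precision", precision(i))
```

With this sample, 104 of 200 records are predicted correctly, which gives the reported accuracy of 52 and error rate of 0.48.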
Data Usage FAQ
In addition to the Phoenix ML pricing plans that are consumed per API call, you can use SCUs to consume Phoenix ML APIs. The table below lists the SCUs consumed by each API.
APIs | SCUs Consumed |
Train/Predict/Feedback a Classification Model | 10 |
Train/Predict/Feedback a Regression Model | 10 |
Cluster Model | 7 |
Get Job Status | 0 |
Get Job Output | 0 |
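As a quick worked example, the Python sketch below estimates the SCUs for a small workflow from the table above. It assumes that each individual train, predict, or feedback call consumes the listed amount; the dictionary keys and the `total_scus` helper are hypothetical names.

```python
# SCUs per call, taken from the table; the keys are hypothetical names.
SCU_COST = {
    "classification": 10,  # each train, predict, or feedback call
    "regression": 10,      # each train, predict, or feedback call
    "cluster": 7,
    "job_status": 0,
    "job_output": 0,
}


def total_scus(calls):
    """Sum the SCUs for a mapping of API name to number of calls."""
    return sum(SCU_COST[name] * count for name, count in calls.items())


# One classification train plus one predict, one clustering job,
# ten status polls, and three output fetches.
workflow = {"classification": 2, "cluster": 1, "job_status": 10, "job_output": 3}
print(total_scus(workflow))  # 27
```

Status and output calls consume no SCUs, so polling a job does not add to the total.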
Release Notes
The following are the release notes for Release 2.0.0.
Algorithms supported under classification are Logistic, MultilayerPerceptron, NaiveBayesMultinomial, RandomForest, LibSVM, AdaBoostM1, AttributeSelectedClassifier, Bagging, CostSensitiveClassifier, DecisionTable, GaussianProcesses, IBk, RandomTree, SMO.
Algorithms supported under Regression are LinearRegression, AdditiveRegression.
The feedback dataset should contain the same features as the original dataset.
The maximum train file upload size is 100 MB.
The maximum test file upload size is 50 MB.
No more than 20 features can be selected for training.
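These limits can be checked on the client before a training job is submitted. The Python sketch below is one possible pre-flight check, not part of the platform. The `check_train_config` helper is a hypothetical name, the algorithm lists are copied from the notes above, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
CLASSIFICATION_ALGORITHMS = {
    "Logistic", "MultilayerPerceptron", "NaiveBayesMultinomial", "RandomForest",
    "LibSVM", "AdaBoostM1", "AttributeSelectedClassifier", "Bagging",
    "CostSensitiveClassifier", "DecisionTable", "GaussianProcesses", "IBk",
    "RandomTree", "SMO",
}
REGRESSION_ALGORITHMS = {"LinearRegression", "AdditiveRegression"}
SUPPORTED = {
    "classification": CLASSIFICATION_ALGORITHMS,
    "regression": REGRESSION_ALGORITHMS,
}
MAX_FEATURES = 20
MAX_TRAIN_BYTES = 100 * 1024 * 1024  # assumes 1 MB = 1024 * 1024 bytes


def check_train_config(service, config, train_file_bytes=None):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    allowed = SUPPORTED.get(service.lower())
    if allowed is None:
        problems.append(f"unknown service: {service}")
    elif config.get("algorithm") not in allowed:
        problems.append(f"unsupported algorithm: {config.get('algorithm')}")
    if len(config.get("features", [])) > MAX_FEATURES:
        problems.append(f"more than {MAX_FEATURES} features selected")
    if train_file_bytes is not None and train_file_bytes > MAX_TRAIN_BYTES:
        problems.append("train file larger than 100 MB")
    return problems


config = {"algorithm": "LinearRegression", "features": ["area", "parking_area"]}
print(check_train_config("regression", config))  # []
```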