Overview


Data Dockyard is a big data platform that ingests, transforms, and enriches data and delivers it to configured destinations. It lets you securely ingest data from any source and search, analyze, and visualize it in real time, so you can view information and events as they unfold. You can query across multiple data types and sources, configure sources, and integrate Data Dockyard with your application to unlock new insights.


Beginner's Guide


"Big data" is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software.

Terminology

Ingester: The module that ingests data from various data sources.

Transformer: The module that transforms ingested data into the format you require.

Sink: The module that writes your data to one or more destinations.

Process Engine: The module that orchestrates data through the modules above.


Quickstart


Release 2.0.0 of Data Dockyard enables you to securely ingest data from any source, search, analyze, and visualize it in real time, and query across multiple data types and sources.


APIs


All URIs below are relative to https://studio.spotflock.com

Create a Stream               POST  /api/v1/process-service/darwin-process/process-stream
Configure DAPI Events         POST  /api/v1/ingester-service/darwin-ingester/ingester/events/add
Configure Ingester            POST  /api/v1/ingester-service/darwin-ingester/config/add
Configure Transformer         POST  /api/v1/transformer-service/darwin-transformer/transform
Configure Sink                POST  /api/v1/sink-service/darwin-sink/sink/config/status
Sink Status                   GET   /api/v1/sink-service/darwin-sink/sink/config/status/all
Create Process Definition     POST  /api/v1/process-service/darwin-process/process-definition
Schedule Process Execution    POST  /api/v1/process-service/darwin-process/process-execution/schedule
Start Process Execution       POST  /api/v1/process-service/darwin-process/process-execution/start
Stop Process Execution        POST  /api/v1/process-service/darwin-process/process-execution/stop
Unschedule Process Execution  POST  /api/v1/process-service/darwin-process/process-execution/unschedule
Query Data                    POST  /api/v1/process-service/darwin-process/process-sink

Create a Stream

Description

This API creates a stream.

URI

POST  /api/v1/process-service/darwin-process/process-stream

Headers

api-key Your App's API Key

Attributes

name Name of the Stream
description Description of the stream.

Request Example

{
  "name": "GoFlocker Stream",
  "description": "Stream for the GoFlocker Module"
}

Response

{
    "id": "a239-hdu2-3d43",
    "name": "GoFlocker Stream",
    "description": "Stream for the GoFlocker Module"
}
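
As an illustration of the call above, the following is a minimal Python sketch that builds the Create a Stream request without sending it. The function name `build_create_stream_request` and the `Content-Type` header are assumptions made for this example; only the URI, the `api-key` header, and the `name` and `description` attributes come from this document.

```python
import json
import urllib.request

BASE_URL = "https://studio.spotflock.com"
STREAM_PATH = "/api/v1/process-service/darwin-process/process-stream"


def build_create_stream_request(api_key, name, description):
    """Build (but do not send) the POST request that creates a stream."""
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + STREAM_PATH,
        data=body,
        method="POST",
        headers={
            "api-key": api_key,
            # Assumption: the service expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


request = build_create_stream_request(
    "YOUR_API_KEY", "GoFlocker Stream", "Stream for the GoFlocker Module"
)
# To send it, pass the request to urllib.request.urlopen() and parse the
# JSON body of the response.
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before any network call is made.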

Configure DAPI Events

Description

This API enables you to configure the DAPI Events Metadata.

URI

POST  /api/v1/ingester-service/darwin-ingester/ingester/events/add

Headers

api-key Your App's API Key

Attributes

{0}.field Event Name
{0}.type Field Type (text/Integer/Long/Float)
{0}.operation Aggregation operation (for example, AVG, as used in the request example)
{0}.description Field Description
{0}.eventId Field Identifier

Request Example

[
{
  "eventId": 14275,
  "field": "session",
  "type": "Integer",
  "operation": "AVG",
  "description": "Session Length of a user"
},
{
  "eventId": 14726,
  "field": "achievements",
  "type": "Integer",
  "operation": "AVG",
  "description": "Achievements of a user"
}
]

Response


[{
  "eventId": 14275,
  "field": "session",
  "type": "Integer",
  "operation": "AVG",
  "description": "Session Length of a user"
},
{
  "eventId": 14726,
  "field": "achievements",
  "type": "Integer",
  "operation": "AVG",
  "description": "Achievements of a user"
}
]
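
Because the events payload is a list of objects, it can help to check each item before posting it. The sketch below is a hypothetical client-side check, not part of the service: the helper name `validate_events` and the choice to treat `eventId`, `field`, `type`, and `description` as required are assumptions. The allowed types are the ones listed under Attributes.

```python
ALLOWED_TYPES = {"text", "Integer", "Long", "Float"}
# Assumption: these four keys are treated as required by this client-side check.
REQUIRED_KEYS = {"eventId", "field", "type", "description"}


def validate_events(events):
    """Return a list of problems found in a DAPI events payload (empty if none)."""
    problems = []
    for index, event in enumerate(events):
        missing = REQUIRED_KEYS - event.keys()
        if missing:
            problems.append(f"item {index}: missing {sorted(missing)}")
        if event.get("type") not in ALLOWED_TYPES:
            problems.append(f"item {index}: unsupported type {event.get('type')!r}")
    return problems


events = [
    {"eventId": 14275, "field": "session", "type": "Integer",
     "operation": "AVG", "description": "Session Length of a user"},
    {"eventId": 14726, "field": "achievements", "type": "Integer",
     "operation": "AVG", "description": "Achievements of a user"},
]
problems = validate_events(events)
```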

Configure Ingester

Description

This API configures the sources from which you want to ingest data.

URI

POST  /api/v1/ingester-service/darwin-ingester/config/add

Headers

api-key Your App's API Key

Attributes

source Source to ingest from (TWITTER/FACEBOOK/DAPI/INSTAGRAM/YOUTUBE)
configData Metadata and properties of the source

Request Example

{
  "source": "TWITTER",
  "configData": {"tags": ["#ai", "#spotflock"]}
}

Response

{
    "id": "92a9-92u2-3243",
    "config": {"TWITTER": {"tags": ["#ai", "#spotflock"]}},
    "createdDate": "2019-02-05T16:14:31.467+0000"
}
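
A request body for this endpoint must be valid JSON, which means double-quoted strings and no trailing commas. One way to avoid hand-written JSON mistakes is to build the body from a Python dictionary. The sketch below assumes a hypothetical helper, `build_ingester_config`, that also rejects sources not listed under Attributes; whether the service itself accepts lowercase source names is not stated here, so the helper upper-cases them.

```python
import json

SUPPORTED_SOURCES = {"TWITTER", "FACEBOOK", "DAPI", "INSTAGRAM", "YOUTUBE"}


def build_ingester_config(source, config_data):
    """Return the JSON body for Configure Ingester, validating the source name."""
    source = source.upper()
    if source not in SUPPORTED_SOURCES:
        raise ValueError(f"unsupported source: {source}")
    # json.dumps always emits double-quoted strings and no trailing commas.
    return json.dumps({"source": source, "configData": config_data})


body = build_ingester_config("twitter", {"tags": ["#ai", "#spotflock"]})
```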

Configure Transformer

Description

This API creates a transformer that converts your ingested data to a particular format. It uses the Jolt transformation mechanism to transform data.

URI

POST  /api/v1/transformer-service/darwin-transformer/transform

Headers

api-key Your App's API Key

Attributes

name Name of the transformer
description Description of the transformer
transformer Jolt Transformer Object

Request Example

The Jolt spec in this example defaults every entry under "SecondaryRatings" to a Range of 5.

{
  "name": "Twitter Data Transformer",
  "description": "Twitter Data Transformer",
  "transformer": {
    "schema": [
      {
        "operation": "default",
        "spec": {
          "Range": 5,
          "SecondaryRatings": {
            "*": {
              "Range": 5
            }
          }
        }
      }
    ]
  }
}

Response

{
  "id": "43253x24234",
  "name": "Twitter Data Transformer",
  "description": "Twitter Data Transformer",
  "transformer": {
    "schema": [
      {
        "operation": "default",
        "spec": {
          "Range": 5,
          "SecondaryRatings": {
            "*": {
              "Range": 5
            }
          }
        }
      }
    ]
  }
}
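
To give a feel for what the "default" operation in the example spec does, here is a simplified Python illustration. It is not Jolt and not the service's implementation: the function `apply_default` is hypothetical and ignores Jolt features such as arrays, OR keys, and key precedence. It only shows the basic idea of filling in values that are missing from the input.

```python
def apply_default(spec, data):
    """Simplified illustration of a Jolt-style "default": fill in missing keys."""
    for key, default in spec.items():
        if key == "*":
            # Wildcard: apply the nested spec to every existing child object.
            for child in data.values():
                if isinstance(child, dict) and isinstance(default, dict):
                    apply_default(default, child)
        elif isinstance(default, dict):
            # Nested spec: create the container if it is missing, then recurse.
            child = data.setdefault(key, {})
            if isinstance(child, dict):
                apply_default(default, child)
        else:
            # Literal default: only set the value when the key is absent.
            data.setdefault(key, default)
    return data


spec = {"Range": 5, "SecondaryRatings": {"*": {"Range": 5}}}
record = {
    "SecondaryRatings": {
        "quality": {"Value": 3},
        "sharpness": {"Value": 4, "Range": 10},
    }
}
result = apply_default(spec, record)
```

In this run, the top-level "Range" and the "Range" of "quality" are filled in with 5, while "sharpness" keeps its existing "Range" of 10.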

Configure Sink

Description

This API configures the sink destinations to which you want your data written.

URI

POST  /api/v1/sink-service/darwin-sink/sink/config/status

Headers

api-key Your App's API Key

Attributes

destination Destination to write data to (POSTGRES/INFLUXDB)
isEnabled Whether the destination is enabled (true or false)

Request

{
	"destination":"POSTGRES",
	"isEnabled":true
}

Sink Status

Description

This API returns all the sinks enabled for your app.

URI

GET  /api/v1/sink-service/darwin-sink/sink/config/status/all

Headers

api-key Your App's API Key

Attributes

None

Request

No request body

Response

[
    {
        "name": "POSTGRES",
        "status": true
    }
]
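
The response is a list of sink entries, each with a name and a boolean status. A small sketch of how a client might extract the enabled sink names follows; the helper name `enabled_sinks` is hypothetical, and the response shape is taken from the example above.

```python
import json


def enabled_sinks(response_text):
    """Return the names of sinks whose status is true in a Sink Status response."""
    sinks = json.loads(response_text)
    return [sink["name"] for sink in sinks if sink.get("status")]


sample = '[{"name": "POSTGRES", "status": true}, {"name": "INFLUXDB", "status": false}]'
names = enabled_sinks(sample)
```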

Create Process Definition

Description

This API creates a process definition.

URI

POST  /api/v1/process-service/darwin-process/process-definition

Headers

api-key Your App's API Key

Attributes

name Name of the Process Definition
description Description of the Process Definition
repeat Whether the process repeats (true or false)
repeatInterval Interval between repetitions (for example, 24h)
workflow List of workflow nodes (INGESTER, TRANSFORMER, SINK)

Request

{
  "name": "Goflocker DDSL",
  "description": "Test version 1.0.0",
  "repeat": true,
  "repeatInterval": "24h",
  "from": "",
  "to": "",
  "workflow": [
    {
      "type": "INGESTER",
      "metadata": {
        "sources": [
          {
            "source": "TWITTER",
            "subtype": "TIMELINE"
          }
        ]
      }
    },
    {
      "type": "TRANSFORMER",
      "metadata": {
        "transformers": [
          {
            "source": "TWITTER",
            "transformer": {
              "id": 1
            }
          },
          {
            "source": "FACEBOOK",
            "transformer": {
              "id": 2
            }
          }
        ]
      }
    },
    {
      "type": "SINK",
      "metadata": {
        "destinations": [
          {
            "destination": "INFLUXDB"
          }
        ]
      }
    }
  ]
}
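
The workflow is an ordered list of INGESTER, TRANSFORMER, and SINK nodes, which is tedious to write by hand. The sketch below assembles the same structure from simpler inputs. The function `build_process_definition` and its parameters are hypothetical, and sending an empty `repeatInterval` when the process does not repeat is an assumption. The `from` and `to` fields are left empty, as in the request example, because their meaning is not documented here.

```python
def build_process_definition(name, description, sources, transformer_ids,
                             destinations, repeat_interval=None):
    """Assemble a process definition with INGESTER, TRANSFORMER and SINK nodes.

    sources: list of (source, subtype) pairs, e.g. [("TWITTER", "TIMELINE")]
    transformer_ids: mapping of source name to transformer id
    destinations: list of sink names, e.g. ["INFLUXDB"]
    """
    workflow = [
        {
            "type": "INGESTER",
            "metadata": {
                "sources": [
                    {"source": source, "subtype": subtype}
                    for source, subtype in sources
                ]
            },
        },
        {
            "type": "TRANSFORMER",
            "metadata": {
                "transformers": [
                    {"source": source, "transformer": {"id": transformer_id}}
                    for source, transformer_id in transformer_ids.items()
                ]
            },
        },
        {
            "type": "SINK",
            "metadata": {
                "destinations": [{"destination": d} for d in destinations]
            },
        },
    ]
    return {
        "name": name,
        "description": description,
        "repeat": repeat_interval is not None,
        # Assumption: an empty interval is sent when the process does not repeat.
        "repeatInterval": repeat_interval or "",
        "from": "",
        "to": "",
        "workflow": workflow,
    }


definition = build_process_definition(
    "Goflocker DDSL",
    "Test version 1.0.0",
    sources=[("TWITTER", "TIMELINE")],
    transformer_ids={"TWITTER": 1, "FACEBOOK": 2},
    destinations=["INFLUXDB"],
    repeat_interval="24h",
)
```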

Schedule Process Execution

Description

This API schedules the process created with the Create Process Definition API.

URI

POST  /api/v1/process-service/darwin-process/process-execution/schedule

Headers

api-key Your App's API Key

Response

{
	"status":"200"
}

Start Process Execution

Description

This API starts the process created with the Create Process Definition API.

URI

POST  /api/v1/process-service/darwin-process/process-execution/start

Headers

api-key Your App's API Key

Response

{
	"status":"200"
}

Stop Process Execution

Description

This API stops the process created with the Create Process Definition API.

URI

POST  /api/v1/process-service/darwin-process/process-execution/stop

Headers

api-key Your App's API Key

Response

{
	"status":"200"
}

Unschedule Process Execution

Description

This API unschedules the process created with the Create Process Definition API.

URI

POST  /api/v1/process-service/darwin-process/process-execution/unschedule

Headers

api-key Your App's API Key

Response

{
	"status":"200"
}
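
The four process execution endpoints (schedule, start, stop, unschedule) share one URI prefix and the same response shape, so a client can treat them uniformly. The sketch below is one possible way to do that; the helper names `execution_url` and `is_success` are hypothetical. How a specific process definition is identified in these calls is not described in this document, so the sketch covers only the URL and the response check.

```python
BASE_URL = "https://studio.spotflock.com"
EXECUTION_PATH = "/api/v1/process-service/darwin-process/process-execution/"
ACTIONS = ("schedule", "start", "stop", "unschedule")


def execution_url(action):
    """Return the full URL for one of the process execution actions."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return BASE_URL + EXECUTION_PATH + action


def is_success(response):
    """Check the documented response shape, where status is the string "200"."""
    return str(response.get("status")) == "200"


start_url = execution_url("start")
ok = is_success({"status": "200"})
```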

Query Data

Description

The API enables you to query data from your sink.

URI

POST  /api/v1/process-service/darwin-process/process-sink

Headers

api-key Your App's API Key

Attributes

query Query to run against the sink (for example, SELECT * FROM TABLE)
destination Configured destination to query (POSTGRES or INFLUXDB)

Request

{
	"query":"SELECT * FROM TABLE",
	"destination":"POSTGRES"
}

Response

{
    "0": [
        "name",
        "age",
        "weight",
        "timestamp"
    ],
    "1": [
        "sd",
        34,
        45,
        "2019-05-24"
    ]
}
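
In the example response, the entry under key "0" appears to hold the column names and each later key holds one row. Under that assumption, which is inferred from the example rather than stated by the API, a client could turn the response into a list of records as sketched here. The helper name `rows_as_dicts` is hypothetical.

```python
def rows_as_dicts(response):
    """Convert a Query Data response into a list of {column: value} records.

    Assumes key "0" holds the column names and the remaining numeric keys
    hold the rows, as in the example response.
    """
    columns = response["0"]
    row_keys = sorted((key for key in response if key != "0"), key=int)
    return [dict(zip(columns, response[key])) for key in row_keys]


response = {
    "0": ["name", "age", "weight", "timestamp"],
    "1": ["sd", 34, 45, "2019-05-24"],
}
records = rows_as_dicts(response)
```

Sorting the row keys numerically keeps rows in order even when there are ten or more, where plain string sorting would place "10" before "2".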


Release Notes


The following are the release notes for Release 2.0.0:

  • Ingester supports Twitter, Facebook, Instagram, YouTube and DAPI Events.

  • Transformer is based on Jolt Transformer.

  • Sink supports PostgreSQL and InfluxDB.

Helpdesk