Overview
Data Dockyard is a big data platform that ingests, transforms, and enriches data and delivers it to configured destinations. It lets you securely ingest data from any source, then search, analyze, and visualize it in real time as events unfold. You can query across multiple data types and sources, configure sources, and integrate them with your application to unlock new insights.
Beginner's Guide
"Big data" is a field that treats ways to analyze, systematically extract information from, or otherwise deal with data sets that are too large or complex to be dealt with by traditional data-processing application software.
Terminologies
Ingester : The module that ingests data from various data sources.
Transformer : The module that transforms ingested data into the format you require.
Sink : The module that writes your data to one or more destinations.
Process Engine : The module that orchestrates the flow of data through the above modules.
Quickstart
Release 2.0.0 of Data Dockyard enables you to securely ingest data from any source, to search, analyze, and visualize it in real time, and to query across multiple data types and sources.
APIs
All URIs below are relative to https://studio.spotflock.com
Create a Stream | POST /api/v1/process-service/darwin-process/process-stream |
Configure DAPI Events | POST /api/v1/ingester-service/darwin-ingester/ingester/events/add |
Configure Ingester | POST /api/v1/ingester-service/darwin-ingester/config/add |
Configure Transformer | POST /api/v1/transformer-service/darwin-transformer/transform |
Configure Sink | POST /api/v1/sink-service/darwin-sink/sink/config/status |
Sink Status | GET /api/v1/sink-service/darwin-sink/sink/config/status/all |
Create Process Definition | POST /api/v1/process-service/darwin-process/process-definition |
Schedule Process Execution | POST /api/v1/process-service/darwin-process/process-execution/schedule |
Start Process Execution | POST /api/v1/process-service/darwin-process/process-execution/start |
Stop Process Execution | POST /api/v1/process-service/darwin-process/process-execution/stop |
Unschedule Process Execution | POST /api/v1/process-service/darwin-process/process-execution/unschedule |
Query Data | POST /api/v1/process-service/darwin-process/process-sink |
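Every endpoint in the table shares the same base URL and the api-key header, so one small helper can prepare a request for any of them. The following is a minimal sketch using only the Python standard library. The helper name build_request and the JSON Content-Type header are assumptions for illustration and are not part of the documented API.

```python
import json
import urllib.request

BASE_URL = "https://studio.spotflock.com"  # base URL stated above the table


def build_request(method, path, api_key, body=None):
    """Return an unsent urllib Request for a Data Dockyard endpoint."""
    headers = {"api-key": api_key}
    data = None
    if body is not None:
        # Assumption: request bodies are sent as JSON.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


req = build_request(
    "GET",
    "/api/v1/sink-service/darwin-sink/sink/config/status/all",
    api_key="YOUR_API_KEY",
)
# Sending is a separate step: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL, method, and headers before any network call is made.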
Create a Stream
Description
This API creates a stream.
URI
POST
/api/v1/process-service/darwin-process/process-stream
Headers
api-key | Your App's API Key |
Attributes
name | Name of the Stream |
description | Description of the stream. |
Request Example
{
"name": "GoFlocker Stream",
"description": "Stream for the GoFlocker Module"
}
Response
{
"id": "a239-hdu2-3d43",
"name": "GoFlocker Stream",
"description": "Stream for the GoFlocker Module"
}
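The request and response above differ only by the server-assigned id. As a rough sketch, the body can be serialized and the response parsed into a small record as shown here. The Stream class and both function names are hypothetical helpers, not part of the API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Stream:
    id: str
    name: str
    description: str


def stream_payload(name, description):
    """Serialize the two documented attributes into a JSON request body."""
    return json.dumps({"name": name, "description": description})


def parse_stream(response_text):
    """Turn a response shaped like the example above into a Stream record."""
    fields = json.loads(response_text)
    return Stream(fields["id"], fields["name"], fields["description"])


body = stream_payload("GoFlocker Stream", "Stream for the GoFlocker Module")
sample_response = (
    '{"id": "a239-hdu2-3d43", "name": "GoFlocker Stream", '
    '"description": "Stream for the GoFlocker Module"}'
)
stream = parse_stream(sample_response)
```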
Configure DAPI Events
Description
This API enables you to configure the DAPI Events Metadata.
URI
POST
/api/v1/ingester-service/darwin-ingester/ingester/events/add
Headers
api-key | Your App's API Key |
Attributes
{0}.field | Event Name |
{0}.type | Field Type (text/Integer/Long/Float) |
{0}.description | Field Description |
{0}.eventId | Field Identifier |
Request Example
[
{
"eventId": 14275,
"field": "session",
"type": "Integer",
"operation": "AVG",
"description": "Session Length of a user"
},
{
"eventId": 14726,
"field": "achievements",
"type": "Integer",
"operation": "AVG",
"description": "Achievements of a user"
}
]
Response
[{
"eventId": 14275,
"field": "session",
"type": "Integer",
"operation": "AVG",
"description": "Session Length of a user"
},
{
"eventId": 14726,
"field": "achievements",
"type": "Integer",
"operation": "AVG",
"description": "Achievements of a user"
}
]
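Because the payload is a list of event descriptors, it can be worth checking it locally before posting. The sketch below validates each entry against the Attributes table. It assumes all four documented attributes are required, which the source does not state, and it ignores the undocumented "operation" key that appears in the example. The function name validate_events is hypothetical.

```python
ALLOWED_TYPES = {"text", "Integer", "Long", "Float"}  # from the Attributes table
REQUIRED_KEYS = ("eventId", "field", "type", "description")  # assumed required


def validate_events(events):
    """Return a list of problems; an empty list means the payload looks fine."""
    problems = []
    for index, event in enumerate(events):
        for key in REQUIRED_KEYS:
            if key not in event:
                problems.append(f"event {index}: missing '{key}'")
        field_type = event.get("type")
        if field_type is not None and field_type not in ALLOWED_TYPES:
            problems.append(f"event {index}: unsupported type '{field_type}'")
    return problems


events = [
    {
        "eventId": 14275,
        "field": "session",
        "type": "Integer",
        "operation": "AVG",
        "description": "Session Length of a user",
    }
]
problems = validate_events(events)
```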
Configure Ingester
Description
This API configures the sources from which you want to ingest data.
URI
POST
/api/v1/ingester-service/darwin-ingester/config/add
Headers
api-key | Your App's API Key |
Attributes
source | Source to ingest from (TWITTER/FACEBOOK/DAPI/INSTAGRAM/YOUTUBE) |
configData | Metadata and properties of the source |
Request Example
{
"source": "TWITTER",
"configData": {"tags": ["#ai", "#spotflock"]}
}
Response
{
"id": "92a9-92u2-3243",
"config": {"TWITTER": {"tags": ["#ai", "#spotflock"]}},
"createdDate": "2019-02-05T16:14:31.467+0000"
}
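JSON requires double-quoted strings and no trailing commas, so it is usually safer to build the body from native data and let a serializer handle the quoting. The following sketch does that and rejects sources outside the documented list. The function name ingester_config is hypothetical, and upper-casing the source is an assumption about what the service expects.

```python
import json

SOURCES = {"TWITTER", "FACEBOOK", "DAPI", "INSTAGRAM", "YOUTUBE"}  # documented sources


def ingester_config(source, config_data):
    """Return a JSON body for the Configure Ingester endpoint."""
    source = source.upper()  # assumption: source names are upper case
    if source not in SOURCES:
        raise ValueError(f"unsupported source: {source}")
    return json.dumps({"source": source, "configData": config_data})


body = ingester_config("twitter", {"tags": ["#ai", "#spotflock"]})
```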
Configure Transformer
Description
This API creates a transformer that converts your ingested data to a particular format. It uses the Jolt transformation mechanism.
URI
POST
/api/v1/transformer-service/darwin-transformer/transform
Headers
api-key | Your App's API Key |
Attributes
name | Name of the transformer |
description | Description of the transformer |
transformer | Jolt Transformer Object |
Request Example
{
"name": "Twitter Data Transformer",
"description": "Twitter Data Transformer",
"transformer": {
"schema": [
{
"operation": "default",
"spec": {
"Range": 5,
"SecondaryRatings": {
"*": {
// Default all "SecondaryRatings" to have a Range of 5
"Range": 5
}
}
}
}
]
}
}
Response
{
"id": "43253x24234",
"name": "Twitter Data Transformer",
"description": "Twitter Data Transformer",
"transformer": {
"schema": [
{
"operation": "default",
"spec": {
"Range": 5,
"SecondaryRatings": {
"*": {
// Default all "SecondaryRatings" to have a Range of 5
"Range": 5
}
}
}
}
]
}
}
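To make the effect of the "default" operation in the example spec more concrete, here is a simplified Python illustration of the idea: fill in keys that are missing from the input and apply the "*" entry to every existing child object. This is not Jolt's implementation and covers only the subset used in the example. The function name apply_default and the sample input are made up for illustration.

```python
def apply_default(spec, data):
    """Fill keys missing from `data` using `spec` (simplified 'default' semantics)."""
    for key, default in spec.items():
        if key == "*":
            # Wildcard: apply the nested spec to every existing object child.
            for child in data.values():
                if isinstance(child, dict) and isinstance(default, dict):
                    apply_default(default, child)
        elif isinstance(default, dict):
            # Nested spec: create the container if absent, then recurse.
            child = data.setdefault(key, {})
            if isinstance(child, dict):
                apply_default(default, child)
        else:
            # Literal default: only set when the key is missing.
            data.setdefault(key, default)
    return data


spec = {"Range": 5, "SecondaryRatings": {"*": {"Range": 5}}}
ratings = {
    "SecondaryRatings": {
        "quality": {"Value": 3},
        "sharpness": {"Value": 4, "Range": 10},
    }
}
result = apply_default(spec, ratings)
```

In this sketch an existing value is never overwritten, so "sharpness" keeps its own Range while "quality" and the top level receive the default of 5.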
Configure Sink
Description
The API configures the sink destinations to which you want your data to be written.
URI
POST
/api/v1/sink-service/darwin-sink/sink/config/status
Headers
api-key | Your App's API Key |
Attributes
destination | Destination to write data to (POSTGRES/INFLUXDB) |
isEnabled | True or False |
Request
{
"destination":"POSTGRES",
"isEnabled":true
}
Sink Status
Description
The API returns all the enabled sinks for the app identified by the API key.
URI
GET
/api/v1/sink-service/darwin-sink/sink/config/status/all
Headers
api-key | Your App's API Key |
Attributes
Request
Response
[
{
"name": "POSTGRES",
"status": true
}
]
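A client will typically reduce this response to a list of sink names. The sketch below does so and filters on the status flag defensively, since the source does not say whether entries with a false status can appear. The function name enabled_sinks and the INFLUXDB entry in the sample are hypothetical.

```python
import json


def enabled_sinks(response_text):
    """Return the names of sinks whose status flag is true."""
    sinks = json.loads(response_text)
    return [sink["name"] for sink in sinks if sink.get("status") is True]


# The second entry is a made-up example of a disabled sink.
sample_response = (
    '[{"name": "POSTGRES", "status": true}, '
    '{"name": "INFLUXDB", "status": false}]'
)
names = enabled_sinks(sample_response)
```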
Create Process Definition
Description
This API creates a process definition.
URI
POST
/api/v1/process-service/darwin-process/process-definition
Headers
api-key | Your App's API Key |
Attributes
name | Name of the Process Definition |
description | Description of the Process Definition |
repeat | True or False, depending on whether the process repeats |
repeatInterval | Interval between repetitions (for example, 24h) |
workflow | List of nodes |
Request
{
"name" : "Goflocker DDSL",
"description" : "Test version 1.0.0",
"repeat" : true,
"repeatInterval" : "24h",
"from" : "",
"to" : "",
"workflow":
[
{
"type" : "INGESTER",
"metadata":{
"sources": [
{
"source" : "TWITTER",
"subtype": "TIMELINE"
}
]
}
},
{
"type" : "TRANSFORMER",
"metadata" : {
"transformers": [
{
"source": "TWITTER",
"transformer":
{
"id": 1
}
},
{
"source": "FACEBOOK",
"transformer":
{
"id": 2
}
}
]
}
},
{
"type" : "SINK",
"metadata":{
"destinations":
[
{
"destination": "INFLUXDB"
}
]
}
}
]
}
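The workflow in the request above is a three-node list: an INGESTER node, a TRANSFORMER node, and a SINK node. A small builder can keep that nesting consistent. The following is a sketch only. The function name and its parameters are hypothetical, and "from" and "to" are left empty as in the example because the source does not document them.

```python
def process_definition(name, description, sources, transformer_ids,
                       destinations, repeat=False, repeat_interval=""):
    """Assemble a body with INGESTER, TRANSFORMER, and SINK workflow nodes."""
    workflow = [
        {"type": "INGESTER", "metadata": {"sources": [
            {"source": source, "subtype": subtype}
            for source, subtype in sources
        ]}},
        {"type": "TRANSFORMER", "metadata": {"transformers": [
            {"source": source, "transformer": {"id": transformer_id}}
            for source, transformer_id in transformer_ids.items()
        ]}},
        {"type": "SINK", "metadata": {"destinations": [
            {"destination": destination} for destination in destinations
        ]}},
    ]
    return {
        "name": name,
        "description": description,
        "repeat": repeat,
        "repeatInterval": repeat_interval,
        "from": "",  # undocumented in the source; kept empty as in the example
        "to": "",
        "workflow": workflow,
    }


definition = process_definition(
    "Goflocker DDSL",
    "Test version 1.0.0",
    sources=[("TWITTER", "TIMELINE")],
    transformer_ids={"TWITTER": 1, "FACEBOOK": 2},
    destinations=["INFLUXDB"],
    repeat=True,
    repeat_interval="24h",
)
```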
Schedule Process Execution
Description
The API schedules the process defined through the Create Process Definition API.
URI
POST
/api/v1/process-service/darwin-process/process-execution/schedule
Headers
api-key | Your App's API Key |
Response
{
"status":"200"
}
Start Process Execution
Description
The API starts the process defined through the Create Process Definition API.
URI
POST
/api/v1/process-service/darwin-process/process-execution/start
Headers
api-key | Your App's API Key |
Response
{
"status":"200"
}
Stop Process Execution
Description
The API stops the process defined through the Create Process Definition API.
URI
POST
/api/v1/process-service/darwin-process/process-execution/stop
Headers
api-key | Your App's API Key |
Response
{
"status":"200"
}
Unschedule Process Execution
Description
The API unschedules the process defined through the Create Process Definition API.
URI
POST
/api/v1/process-service/darwin-process/process-execution/unschedule
Headers
api-key | Your App's API Key |
Response
{
"status":"200"
}
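The four process execution endpoints (schedule, start, stop, unschedule) differ only in the last path segment and return the same response shape. The sketch below derives the URL from an action name and checks the response. Note that the example responses carry the status as the string "200", not a number. Both function names are hypothetical, and the source shows no request body for these endpoints, so none is modelled here.

```python
import json

BASE_URL = "https://studio.spotflock.com"
EXECUTION_PATH = "/api/v1/process-service/darwin-process/process-execution/"
ACTIONS = ("schedule", "start", "stop", "unschedule")


def execution_url(action):
    """Return the full URL for one of the four process execution actions."""
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action}")
    return BASE_URL + EXECUTION_PATH + action


def succeeded(response_text):
    """True when the response body reports status "200" (a string)."""
    return json.loads(response_text).get("status") == "200"


start_url = execution_url("start")
ok = succeeded('{"status":"200"}')
```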
Query Data
Description
The API enables you to query data from your sink.
URI
POST
/api/v1/process-service/darwin-process/process-sink
Headers
api-key | Your App's API Key |
Attributes
query | Query to run against the sink, for example SELECT * FROM TABLE |
destination | Configured destination to query (POSTGRES or INFLUXDB) |
Request
{
"query":"SELECT * FROM TABLE",
"destination":"POSTGRES"
}
Response
{
"0": [
"name",
"age",
"weight",
"timestamp"
],
"1": [
"sd",
34,
45,
"2019-05-24"
]
}
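In the response above, the entry under "0" appears to hold the column names and each later key holds one row. Under that assumption, which the source does not state explicitly, the result can be reshaped into one dictionary per row. The function name rows_to_records is hypothetical.

```python
def rows_to_records(response):
    """Convert {"0": columns, "1": row, ...} into a list of dicts keyed by column."""
    columns = response["0"]  # assumption: key "0" is the header row
    # Sort the remaining keys numerically so "10" comes after "2".
    row_keys = sorted((key for key in response if key != "0"), key=int)
    return [dict(zip(columns, response[key])) for key in row_keys]


response = {
    "0": ["name", "age", "weight", "timestamp"],
    "1": ["sd", 34, 45, "2019-05-24"],
}
records = rows_to_records(response)
```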
Release Notes
The following are the release notes for Release 2.0.0:
Ingester supports Twitter, Facebook, Instagram, YouTube and DAPI Events.
Transformer is based on Jolt Transformer.
Sink supports PostgreSQL and InfluxDB.