Real-Time Data API

Ingest and retrieve real-time machine data using CSV-based storage. Designed for high-volume industrial data collection with automatic buffering and efficient retrieval.

5 Endpoints | CSV Storage | Time-Series | MinIO Backend

Overview

Data Flow

  • Ingestion: Push data points via REST API
  • Buffering: In-memory accumulation for efficiency
  • Storage: Daily CSV files in MinIO
  • Retrieval: Query by date or recent data

Features

  • Automatic machine status updates
  • Organization-isolated storage
  • Date-partitioned CSV files
  • Efficient bulk retrieval

Data Ingestion

Push real-time data points from PLCs, sensors, or other data sources. Data is automatically buffered and written to daily CSV files.

POST /v1/realtime-data/ingest (Auth Required)
Ingest Machine Data
Push data points for a machine. Receiving data sets the machine's connection status to 'connected'.

Request Body

Request (json)
{
  "machine_id": "550e8400-e29b-41d4-a716-446655440000",
  "data_points": [
    {
      "tag_name": "TE01_PV",
      "value": 45.2,
      "unit": "celsius"
    },
    {
      "tag_name": "PT02_PV",
      "value": 3.5,
      "unit": "bar"
    },
    {
      "tag_name": "FT03_PV",
      "value": 125.7,
      "unit": "lpm"
    }
  ],
  "timestamp": "2024-08-26T15:30:00Z"
}

Response

Response (json)
{
  "success": true,
  "message": "Successfully ingested 3 data points",
  "points_processed": 3
}

Try it out

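cURL (bash)
curl -X POST "https://sapienstream.com/api/v1/realtime-data/ingest" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"machine_id": "550e8400-e29b-41d4-a716-446655440000", "data_points": [{"tag_name": "TE01_PV", "value": 45.2, "unit": "celsius"}], "timestamp": "2024-08-26T15:30:00Z"}'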
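
The same request can be prepared from a script. The following is a minimal Python sketch using only the standard library; the helper names `build_ingest_payload` and `build_ingest_request` and the tuple-based `readings` input are illustrative assumptions, not part of the API. It builds the request without sending it.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Base URL taken from the cURL examples in this document.
BASE_URL = "https://sapienstream.com/api/v1/realtime-data"


def build_ingest_payload(machine_id, readings, timestamp=None):
    """Build the JSON body for POST /ingest.

    `readings` is a list of (tag_name, value, unit) tuples (hypothetical input shape).
    `timestamp` should be a timezone-aware datetime; defaults to the current UTC time.
    """
    if timestamp is None:
        timestamp = datetime.now(timezone.utc)
    return {
        "machine_id": machine_id,
        "data_points": [
            {"tag_name": tag, "value": value, "unit": unit}
            for tag, value, unit in readings
        ],
        # UTC with a trailing "Z", matching the request example above.
        "timestamp": timestamp.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def build_ingest_request(api_key, payload):
    """Prepare (but do not send) the HTTP request."""
    return urllib.request.Request(
        f"{BASE_URL}/ingest",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_ingest_payload(
    "550e8400-e29b-41d4-a716-446655440000",
    [("TE01_PV", 45.2, "celsius"), ("PT02_PV", 3.5, "bar")],
    datetime(2024, 8, 26, 15, 30, tzinfo=timezone.utc),
)
request = build_ingest_request("YOUR_API_KEY", payload)
# To send: urllib.request.urlopen(request)
```

Keeping payload construction separate from sending makes the body easy to inspect or log before it reaches the API.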

Data Retrieval

Query historical data from CSV files. Data is organized by machine and date for efficient retrieval.

GET /v1/realtime-data/machines/{machine_id}/files (Auth Required)
List Data Files
List all available CSV data files for a machine. Files are organized by date.

Parameters

machine_id (string, required)

Machine UUID

Response

Response (json)
{
  "machine_id": "550e8400-e29b-41d4-a716-446655440000",
  "files": [
    {
      "date": "2024-08-26",
      "filename": "2024-08-26.csv",
      "size_bytes": 125430,
      "row_count": 4320
    },
    {
      "date": "2024-08-25",
      "filename": "2024-08-25.csv",
      "size_bytes": 118920,
      "row_count": 4105
    }
  ]
}

Try it out

cURL (bash)
curl -X GET "https://sapienstream.com/api/v1/realtime-data/machines/{machine_id}/files" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"
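
Once the listing is decoded from JSON, it can be summarized client-side. This is a small Python sketch that assumes the response has exactly the shape shown above; `summarize_files` is a hypothetical helper name.

```python
from datetime import date


def summarize_files(listing):
    """Return (sorted dates, total bytes, total rows) for a List Data Files response."""
    files = listing.get("files", [])
    dates = sorted(date.fromisoformat(f["date"]) for f in files)
    total_bytes = sum(f["size_bytes"] for f in files)
    total_rows = sum(f["row_count"] for f in files)
    return dates, total_bytes, total_rows


# Sample taken from the response example above.
listing = {
    "machine_id": "550e8400-e29b-41d4-a716-446655440000",
    "files": [
        {"date": "2024-08-26", "filename": "2024-08-26.csv", "size_bytes": 125430, "row_count": 4320},
        {"date": "2024-08-25", "filename": "2024-08-25.csv", "size_bytes": 118920, "row_count": 4105},
    ],
}
dates, total_bytes, total_rows = summarize_files(listing)
```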

GET /v1/realtime-data/machines/{machine_id}/data/{file_date} (Auth Required)
Get Data by Date
Retrieve all data points for a specific date. The rows of that day's CSV file are returned as JSON objects, one per timestamp.

Parameters

machine_id (string, required)

Machine UUID

file_date (string, required)

Date in YYYY-MM-DD format

Response

Response (json)
{
  "machine_id": "550e8400-e29b-41d4-a716-446655440000",
  "date": "2024-08-26",
  "data": [
    {
      "timestamp": "2024-08-26T00:00:00Z",
      "TE01_PV": 44.8,
      "PT02_PV": 3.4,
      "FT03_PV": 124.2
    },
    {
      "timestamp": "2024-08-26T00:01:00Z",
      "TE01_PV": 45.0,
      "PT02_PV": 3.5,
      "FT03_PV": 125.1
    }
  ],
  "row_count": 1440
}

Try it out

cURL (bash)
curl -X GET "https://sapienstream.com/api/v1/realtime-data/machines/{machine_id}/data/{file_date}" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"
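
Each row in the response holds one timestamp with a value per tag. For plotting or analysis it is often handier to have one series per tag. The sketch below, in Python, pivots the rows accordingly; it assumes the response shape shown above, and `rows_to_series` is a hypothetical helper name.

```python
def rows_to_series(response):
    """Map each tag name to a list of (timestamp, value) pairs, in row order."""
    series = {}
    for row in response.get("data", []):
        ts = row["timestamp"]
        for key, value in row.items():
            if key == "timestamp":
                continue  # every other key is treated as a tag column
            series.setdefault(key, []).append((ts, value))
    return series


# Sample rows taken from the response example above.
response = {
    "machine_id": "550e8400-e29b-41d4-a716-446655440000",
    "date": "2024-08-26",
    "data": [
        {"timestamp": "2024-08-26T00:00:00Z", "TE01_PV": 44.8, "PT02_PV": 3.4, "FT03_PV": 124.2},
        {"timestamp": "2024-08-26T00:01:00Z", "TE01_PV": 45.0, "PT02_PV": 3.5, "FT03_PV": 125.1},
    ],
}
series = rows_to_series(response)
```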

GET /v1/realtime-data/machines/{machine_id}/recent (Auth Required)
Get Recent Data
Retrieve the most recent data points for a machine from the buffer or today's file.

Parameters

machine_id (string, required)

Machine UUID

limit (integer, optional)

Max rows to return (default: 100)

Response

Response (json)
{
  "machine_id": "550e8400-e29b-41d4-a716-446655440000",
  "data": [
    {
      "timestamp": "2024-08-26T15:30:00Z",
      "TE01_PV": 45.2,
      "PT02_PV": 3.5,
      "FT03_PV": 125.7
    },
    {
      "timestamp": "2024-08-26T15:29:00Z",
      "TE01_PV": 45.1,
      "PT02_PV": 3.5,
      "FT03_PV": 125.4
    }
  ],
  "source": "buffer"
}

Try it out

cURL (bash)
curl -X GET "https://sapienstream.com/api/v1/realtime-data/machines/{machine_id}/recent" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"
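
The cURL example above omits `limit`. The Python sketch below builds the URL with it, on the assumption that `limit` is passed as a query-string parameter (the parameter list does not say where it goes); `recent_url` is a hypothetical helper name.

```python
from urllib.parse import quote, urlencode

# Base URL taken from the cURL examples in this document.
BASE_URL = "https://sapienstream.com/api/v1/realtime-data"


def recent_url(machine_id, limit=None):
    """Build the Get Recent Data URL; `limit` as a query parameter is an assumption."""
    url = f"{BASE_URL}/machines/{quote(machine_id, safe='')}/recent"
    if limit is not None:
        if limit < 1:
            raise ValueError("limit must be a positive integer")
        url += "?" + urlencode({"limit": limit})
    return url


url = recent_url("550e8400-e29b-41d4-a716-446655440000", limit=10)
```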

Health Check

GET /v1/realtime-data/health
Service Health
Check the health status of the real-time data service.

Response

Response (json)
{
  "status": "healthy",
  "service": "realtime-data",
  "buffer_size": 1250,
  "active_machines": 5
}

Try it out

cURL (bash)
curl -X GET "https://sapienstream.com/api/v1/realtime-data/health" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json"

Storage Architecture

CSV File Structure
Data is stored in daily CSV files organized by organization and machine.
Bucket: org-{organization_id}-primary
Path: realtime-data/{machine_id}/{YYYY-MM-DD}.csv

Example:
org-123-primary/
  realtime-data/
    550e8400-e29b-41d4-a716-446655440000/
      2024-08-26.csv
      2024-08-25.csv
      2024-08-24.csv
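
The bucket and path convention above can be expressed as two small functions. This is a Python sketch of the naming scheme only; `bucket_name` and `object_key` are hypothetical helper names, and nothing here talks to MinIO.

```python
from datetime import date


def bucket_name(organization_id):
    """Bucket: org-{organization_id}-primary"""
    return f"org-{organization_id}-primary"


def object_key(machine_id, day):
    """Path: realtime-data/{machine_id}/{YYYY-MM-DD}.csv"""
    return f"realtime-data/{machine_id}/{day.isoformat()}.csv"


bucket = bucket_name(123)
key = object_key("550e8400-e29b-41d4-a716-446655440000", date(2024, 8, 26))
```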

CSV Format
Each CSV file contains timestamped data with dynamic columns for each tag.
timestamp,TE01_PV,PT02_PV,FT03_PV,VLV_001_STATUS
2024-08-26T00:00:00Z,44.8,3.4,124.2,1
2024-08-26T00:01:00Z,45.0,3.5,125.1,1
2024-08-26T00:02:00Z,45.1,3.5,125.3,0
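
Because the tag columns are dynamic, a reader should take the column names from the header row rather than hard-coding them. The Python sketch below does this with the standard `csv` module. It assumes every tag value is numeric and that a missing value appears as an empty cell; neither is stated in this document. `parse_realtime_csv` is a hypothetical helper name.

```python
import csv
import io


def parse_realtime_csv(text):
    """Parse a daily CSV file into a list of dicts keyed by the header's column names."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {"timestamp": record.pop("timestamp")}
        for tag, raw in record.items():
            if tag is None:
                continue  # cells beyond the header width are ignored
            # Assumption: empty or absent cells mean "no value for this tag".
            row[tag] = float(raw) if raw not in ("", None) else None
        rows.append(row)
    return rows


# Sample taken from the CSV example above.
sample = """timestamp,TE01_PV,PT02_PV,FT03_PV,VLV_001_STATUS
2024-08-26T00:00:00Z,44.8,3.4,124.2,1
2024-08-26T00:01:00Z,45.0,3.5,125.1,1
2024-08-26T00:02:00Z,45.1,3.5,125.3,0
"""
rows = parse_realtime_csv(sample)
```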

Best Practices

Ingestion Frequency

For optimal performance, batch multiple data points together and send at 1-minute intervals rather than sending individual points. This reduces API calls and improves throughput.
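
One way to follow this advice is to collect readings client-side and send them as a single request per interval. Since the ingest body carries one timestamp for all of its data points, the Python sketch below keeps only the newest value per tag between flushes; that is a design assumption, not documented behavior. `LatestValueBatcher` and its `send` callback are hypothetical names.

```python
import time


class LatestValueBatcher:
    """Collect the newest value per tag and hand them to `send` once per interval."""

    def __init__(self, send, interval_seconds=60.0, clock=time.monotonic):
        self._send = send                  # callable that receives a list of data points
        self._interval = interval_seconds
        self._clock = clock                # injectable so the timing can be tested
        self._latest = {}                  # tag_name -> data point dict
        self._last_flush = clock()

    def add(self, tag_name, value, unit):
        # A newer reading for the same tag replaces the older one.
        self._latest[tag_name] = {"tag_name": tag_name, "value": value, "unit": unit}
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self):
        # Send whatever has accumulated, then restart the interval.
        if self._latest:
            self._send(list(self._latest.values()))
            self._latest = {}
        self._last_flush = self._clock()
```

In use, `send` would wrap a call to the ingest endpoint, and `flush()` should also be called on shutdown so the last partial batch is not lost.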

Tag Naming

Use consistent tag names across ingestion calls. The system creates columns dynamically, so inconsistent naming will create multiple columns for the same measurement.
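
A simple guard is to normalize tag names before every ingestion call. The convention in this Python sketch (upper case, underscores as separators) is only an example chosen to match the tag names used in this document; `normalize_tag` is a hypothetical helper name.

```python
import re


def normalize_tag(name):
    """Collapse spelling variants of a tag name into one canonical form."""
    # Replace any run of non-alphanumeric characters with a single underscore.
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name.strip())
    return cleaned.strip("_").upper()


variants = ["TE01_PV", "te01_pv", "TE01-PV", " TE01 PV "]
canonical = {normalize_tag(v) for v in variants}
```

Without normalization, each of the four variants above would be sent under a different name.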

Data Retention

CSV files are retained indefinitely in the organization's storage bucket. Implement your own retention policies by deleting old files via the Storage API.
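
A retention job first needs to decide which files have aged out. The Python sketch below selects them from a List Data Files response and leaves the deletion itself to the Storage API, which is not covered in this section. `files_older_than` and `keep_days` are hypothetical names.

```python
from datetime import date, timedelta


def files_older_than(files, keep_days, today):
    """Return filenames whose date is more than `keep_days` days before `today`."""
    cutoff = today - timedelta(days=keep_days)
    return [f["filename"] for f in files if date.fromisoformat(f["date"]) < cutoff]


files = [
    {"date": "2024-08-26", "filename": "2024-08-26.csv"},
    {"date": "2024-08-25", "filename": "2024-08-25.csv"},
    {"date": "2024-08-24", "filename": "2024-08-24.csv"},
]
expired = files_older_than(files, keep_days=1, today=date(2024, 8, 26))
```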