
ZPK Moderation API - Technical Documentation

An API to detect harmful text using artificial intelligence, capable of detecting threats and messages that incite or support hate.

An API that lets you detect harmful messages written in applications or websites. It is not just text filtering: artificial intelligence is used to contextualize messages and better determine possibly harmful behavior.

Minimizing false positives: This API can distinguish, more reliably than traditional systems, between messages that merely mention dangerous attitudes and messages that incite them, minimizing false positives.

Minimizing false negatives: Artificial intelligence also makes it possible to detect harmful messages even when they do not use 'filtered' words, detecting, for example, veiled threats or insults that do not directly use foul language.

API connection

Server and protocols involved.

API connection details
Server: https://zpk.systems
Scheme: HTTPS, secure connections only.
Protocol:
JSON recommended

Send a JSON body and specify a Content-Type: application/json header in the request.

Files should be sent encoded as a base64 string.
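As a sketch in Python, base64-encoding file bytes for a JSON field looks like this (the field names and file contents here are illustrative, not part of this API's specification):

```python
import base64

# Raw file bytes to be embedded in a JSON request body.
file_bytes = b"example file contents"

# Encode as a base64 ASCII string suitable for a JSON field.
encoded = base64.b64encode(file_bytes).decode("ascii")

# The server can recover the original bytes by decoding.
decoded = base64.b64decode(encoded)
```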


Form-Data not recommended

Format used in web forms; not recommended. When sending a form-data request you can send all the fields separately. However, if the request is very large, we recommend sending the content of the request in a single text field called _json (with an underscore).

This special parameter will be processed by our backend as if it were a regular JSON request.
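As a sketch (in Python, using only the standard library; the credentials and message are placeholders), collapsing a large request into the single _json form field could look like this:

```python
import json
from urllib.parse import urlencode

# Hypothetical payload; application_id and api_key are placeholders.
payload = {
    "application_id": "your_application_id",
    "api_key": "your_api_key",
    "messages": [{"text": "A message to analyze"}],
}

# Instead of one form field per parameter, send the whole payload
# as a JSON string in a single text field called _json.
form_body = urlencode({"_json": json.dumps(payload)})
```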

Send your requests to https://zpk.systems over a secure HTTPS connection. JSON is the recommended communication type: specify a Content-Type: application/json header and send the request body as a valid JSON string.

Text scan

POST /api/moderator/scan
Parameter Description
application_id

string required

An application id. You can get the id of an application by clicking on 'details' in your panel

api_key

string required

The API key of the application making the call to this API. You can obtain it in your panel by clicking the 'details' button of an application.

messages

array required Max: 30

An array of elements to analyze. Each element must contain at least the text to be analyzed, and optionally a source identifier and/or a message identifier.

Parameter Description
text

string required Max: 5,000 characters.

The text to analyze, which cannot exceed 5,000 characters.

source_id

string optional Max: 15 characters.

A unique identifier of the message source, to detect harmful patterns that are repeated in the same message source.

The identifier may be linked to a specific self-imported social network feed, to comments written by a specific user, to a specific forum thread, to a user's messages in a specific chat conversation, or to any other source that you consider appropriate.

message_id

string optional Max: 15 characters.

A unique identifier of the specific text to be analyzed. This identifier is not used internally but is returned in requests, so you can identify the origin of each text if necessary.
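Putting the parameters above together, a minimal scan request could be built as follows (a Python standard-library sketch; the credentials and messages are placeholders):

```python
import json
import urllib.request

# Placeholders: substitute the real values from your panel.
payload = {
    "application_id": "your_application_id",
    "api_key": "your_api_key",
    "messages": [
        {"text": "First message to analyze",
         "source_id": "user_42", "message_id": "msg_1"},
        {"text": "Second message to analyze"},  # identifiers are optional
    ],
}

# POST the payload as JSON to the scan endpoint.
req = urllib.request.Request(
    "https://zpk.systems/api/moderator/scan",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
    method="POST",
)
# response = urllib.request.urlopen(req)   # uncomment to actually send
# result = json.load(response)
```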

Scan Response

Response structure
Parameter Description
success boolean

A boolean with value true if your request could be processed, or false in case of errors.

messages array

An array that will contain all the elements that were requested to be analyzed, with a score in different categories indicating possible alerts.

See the following table for details about the elements of this array.

cost float

Cost of the request, in euros.

Structure of messages array
Parameter Description
input
object

The original input that was sent: the text and, if provided, its source_id and message_id.

source_id
string optional

If a source_id was sent, it will contain the sent source_id.

message_id
string optional

If a message_id identifier was sent, it will contain the sent message_id.

categories
Array of categories

An array containing for each category an identifier of that category, a score, and an alert flag if text was detected that activates that specific category.

The returned categories are:

hate
Hate messages, or incitement to hate.
hate_threatening
Threatening hate messages, such as inciting violence against a group or collective.
self_harm
Messages that promote, or to a lesser extent talk about, topics related to self-harm.
sexual
Messages of sexual content.
sexual_minors
Messages of sexual content related to minors.
violence
Messages with violent content.
violence_graphic
Messages with graphic violence content.

Each category contains:

id
The category identifier: hate, hate_threatening, self_harm, sexual, sexual_minors, violence or violence_graphic
score
A float score that indicates the intensity of that category in the specific message.
danger
A boolean alert: true if the message was classified as dangerous in this category, false otherwise.
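As a sketch, the categories block of each returned message can be reduced to the list of triggered alerts like this (note that in the example responses in this section the scores arrive as strings, so they are converted with float):

```python
# One message entry, shaped like the example responses in this section
# (abbreviated to two categories).
message = {
    "input": {"text": "sample text", "source_id": "test", "message_id": ""},
    "categories": {
        "hate": {"id": "hate", "score": "0.000185327", "danger": False},
        "violence": {"id": "violence", "score": "0.767623782", "danger": True},
    },
}

# Collect the id of every category whose danger alert was raised.
alerts = [c["id"] for c in message["categories"].values() if c["danger"]]

# Scores are returned as strings; convert before comparing numerically.
scores = {cid: float(c["score"]) for cid, c in message["categories"].items()}
```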

Example response to a successful request.


In this example we can see two different messages after processing.

Response OK
{
   "success": true,
   "messages": [
      {
         "input": {
            "text": "Estoy jugando al videojuego este, pero los enemigos no paran de matarme xd los matare a todos",
            "source_id": "test",
            "message_id": ""
         },
         "categories": {
            "hate": {
               "id": "hate",
               "score": "0.000185327",
               "danger": false
            },
            "hate_threatening": {
               "id": "hate_threatening",
               "score": "0.000000335",
               "danger": false
            },
            "self_harm": {
               "id": "self_harm",
               "score": "0.000000909",
               "danger": false
            },
            "sexual": {
               "id": "sexual",
               "score": "0.000002828",
               "danger": false
            },
            "sexual_minors": {
               "id": "sexual_minors",
               "score": "0.000000048",
               "danger": false
            },
            "violence": {
               "id": "violence",
               "score": "0.767623782",
               "danger": true
            },
            "violence_graphic": {
               "id": "violence_graphic",
               "score": "0.000000240",
               "danger": false
            }
         }
      },
      {
         "input": {
            "text": "Los matare a todos",
            "source_id": "test",
            "message_id": ""
         },
         "categories": {
            "hate": {
               "id": "hate",
               "score": "0.960755408",
               "danger": true
            },
            "hate_threatening": {
               "id": "hate_threatening",
               "score": "0.702908933",
               "danger": true
            },
            "self_harm": {
               "id": "self_harm",
               "score": "0.000000001",
               "danger": false
            },
            "sexual": {
               "id": "sexual",
               "score": "0.000002006",
               "danger": false
            },
            "sexual_minors": {
               "id": "sexual_minors",
               "score": "0.000000261",
               "danger": false
            },
            "violence": {
               "id": "violence",
               "score": "0.997496307",
               "danger": true
            },
            "violence_graphic": {
               "id": "violence_graphic",
               "score": "0.000001936",
               "danger": false
            }
         }
      }
   ],
   "cost": 0.003
}

Example of response with general error

Example with an incorrect request that will NOT be processed.

Response Error
{
   "success": false,
   "errors": [
      {
         "id": "INVALID_PARAMETER",
         "message": "Messages array is empty.",
         "on": "messages"
      }
   ]
}

About scores and alerts

How to detect and filter threats and harmful messages.

This API assigns each message a score and a danger alert.

Alerts are usually triggered when the score approaches 0.7; however, the AI can decide whether or not to trigger the alert depending on a contextual analysis of the message.

For example: a message could contain text talking about violence and therefore have a relatively high violence.score, but the AI could decide not to trigger a violence.danger alert if that message refers to a shooter video game and not to a real threat.

In any case, it is you who must determine the threshold your application tolerates in each category: the same restrictions usually do not apply in an adults-only community, where all users are of legal age, and in a community suitable for all audiences.
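For example, per-category thresholds could be sketched like this, stricter for an all-audiences community than for an adults-only one (the threshold values are illustrative, not recommendations):

```python
# Illustrative thresholds only; tune them to your own community.
THRESHOLDS_ADULTS_ONLY = {"sexual": 0.9, "violence": 0.8, "hate": 0.7}
THRESHOLDS_ALL_AUDIENCES = {"sexual": 0.3, "violence": 0.4, "hate": 0.4}

def should_block(categories, thresholds):
    """Block a message if the API raised a danger alert, or if a score
    exceeds this community's own threshold for that category."""
    for cid, data in categories.items():
        if data["danger"] or float(data["score"]) >= thresholds.get(cid, 1.0):
            return True
    return False
```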

Sources, definition and operation

Assigning scores to users, chats, threads, or any other message source.

The ZPK Moderation API assigns every analysis in which a source_id was specified to that source.

When a source is analyzed, or when the source status endpoint is queried, a total average score is returned that you can use to determine the 'overall' attitude of the sender of the messages.

A source can be any message source that your application determines, for example, a specific user, or a specific chat conversation.

Querying statistics about a specific source

This endpoint will return for a specific source, the current score associated with that source in each dangerous category.

It also returns information on the number of messages processed, and a 'global' attitude score.

Source stats endpoint


GET /api/moderator/source-info
Parameter Description
application_id

string required

An application id. You can get the id of an application by clicking on 'details' in your panel

api_key

string required

The API KEY of the application that makes the call to this API, you can obtain it in your panel, by clicking on the 'details' button of an application

source_id

string required

A unique identifier of the source

Source info, response structure


Response structure
Parameter Description
success boolean

A boolean with value true if your request could be processed, or false in case of errors.

scores Associative List

An associative array that will contain the averages for each threat category.

Each item in this list will contain:

average_score
float The average score of all messages sent by this source in the specific category.
percentage_total_messages
float The percentage of this source's messages that have been flagged as danger in this category.
cost float

The request cost, in euros

Source Info Request, example response


Source info request response
{
   "success": true,
   "source_id": "test",
   "scores": {
      "hate": {
         "average_score": 0.3245996,
         "percentage_total_messages": 29.63
      },
      "hate_threatening": {
         "average_score": 0.1642738,
         "percentage_total_messages": 29.63
      },
      "self_harm": {
         "average_score": 0.0004603,
         "percentage_total_messages": 0
      },
      "sexual": {
         "average_score": 0.0018926,
         "percentage_total_messages": 0
      },
      "sexual_minors": {
         "average_score": 0.0001772,
         "percentage_total_messages": 0
      },
      "violence": {
         "average_score": 0.5734031,
         "percentage_total_messages": 44.44
      },
      "violence_graphic": {
         "average_score": 0.0218495,
         "percentage_total_messages": 0
      }
   },
   "cost": 0.005
}
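A sketch of how the scores block of this response could be interpreted to flag a problematic source (the 0.5 average and 25% cutoffs are illustrative, not recommendations):

```python
def risky_categories(scores, avg_cutoff=0.5, pct_cutoff=25.0):
    """Return the categories in which a source's history looks risky:
    a high average score, or a high share of danger-flagged messages."""
    return [
        cid for cid, s in scores.items()
        if s["average_score"] >= avg_cutoff
        or s["percentage_total_messages"] >= pct_cutoff
    ]

# Scores taken from the example response above (abbreviated).
scores = {
    "hate": {"average_score": 0.3245996, "percentage_total_messages": 29.63},
    "violence": {"average_score": 0.5734031, "percentage_total_messages": 44.44},
    "sexual": {"average_score": 0.0018926, "percentage_total_messages": 0},
}
```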

Pricing

This API is billed on demand, and you will only pay based on the number of messages analyzed. There are no volume discounts because our API has the best price from the first request.

Price per message analyzed: € 0.001 for each 500 characters, rounded up.

Price per source statistics request: € 0.005
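As a sketch, the per-message price above can be computed like this (assuming 'rounded up' means rounding the character count up to the next block of 500; actual billing is determined by the API):

```python
import math

PRICE_PER_BLOCK_EUR = 0.001      # per started block of 500 characters
SOURCE_STATS_PRICE_EUR = 0.005   # flat price of a source statistics request

def scan_text_cost_eur(text):
    """Price of analyzing one message: 0.001 euro per 500 characters,
    with the character count rounded up to the next block."""
    return math.ceil(len(text) / 500) * PRICE_PER_BLOCK_EUR
```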

PHP integration

SDK in PHP to detect dangerous messages

ZPK-PHP is an open source library installable through composer that will allow you to integrate our APIs into your PHP project with minimal effort.

SDK Docs

Composer install

Terminal
composer require zpksystems/phpzpk

Example request to scan messages

Message moderation with PHP
<?php
// Composer autoloader (installed via the composer command above).
require 'vendor/autoload.php';

$application = new zpkApplication('app_id','api_key');
$moderator = new zpkModerator($application);

// Queue a text for analysis (up to 30 messages per scan).
$moderator->addText( ['text'=>'I want to k**l myself.'] );

// scan() performs the API call; the result is printed here as JSON.
echo json_encode($moderator->scan(),
 JSON_PRETTY_PRINT|JSON_UNESCAPED_UNICODE);

Signup to try


Login, or sign up for free to test this API.