Modifying documents with the Talkwalker API

danger

Any modifications of documents done via the Talkwalker API will overwrite changes done in Talkwalker. All earlier changes (manual or via export/import) in the same project are lost.

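It is possible to modify documents within a specific dataset by adding /d/<dataset_id> after the project ID in the URL, for example https://api.talkwalker.com/api/v2/docs/p/<project_id>/d/<dataset_id>/create.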
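
The URL layout can be sketched as a small shell helper. The function name docs_url and its argument order are illustrative and not part of the Talkwalker API; only the path segments are taken from the examples on this page.

```shell
# Build a document-modification URL for a project, optionally scoped to a dataset.
# Usage: docs_url <project_id> <operation> [dataset_id]
# (docs_url is a hypothetical helper; only the URL layout comes from this page.)
docs_url() {
  base="https://api.talkwalker.com/api/v2/docs/p/$1"
  if [ -n "${3:-}" ]; then
    # Dataset-scoped variant: /d/<dataset_id> follows the project ID.
    base="$base/d/$3"
  fi
  printf '%s/%s\n' "$base" "$2"
}

docs_url my_project create
docs_url my_project delete my_dataset
```

The access token would then be appended as the access_token query parameter, as in the curl examples on this page.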

Parameters

parameter	description	required	values
access_token	a read/write token specified in the API application	required
annotate	Talkwalker API recalculates the given field based on the updated content	optional	sentiment
return_entry	Specifies if the modified document should be returned	optional	hide (default), show

Advanced Parameters

tip

All parameters in this section are optional.

parameter	description	values
proc_tokens_calc	Specifies when and which tokens are generated	all_if_new (default), all_always_if_possible
proc_lang_autodetect	Specifies when the language is automatically detected	only_if_new (default), always_if_new_content
proc_content_ellipsis	Special behavior for ellipsis in string fields	no_special_handling, ignore_if_start_same (default)
proc_lang_fallback	Fallback language	(default: en)
proc_annotate_language	Language annotation	annotate_if_not_set (default), use_default
proc_annotate_sentiment	Sentiment annotation	annotate_if_not_set (default), use_default
proc_translate_title_content	Specifies when the title and content are translated	only_if_new (default), never, always_if_new_content
proc_extract_aspect_sentiments	Specifies when to extract aspect sentiments	only_if_new (default), never, always_if_new_content
proc_action_on_deleted	Specifies what to do if entry has been deleted (for create and upsert operations)	nothing (default), auto_undelete

Single Documents

To change result documents, use the https://api.talkwalker.com/api/v2/docs/p/<project_id>/<operation> or https://api.talkwalker.com/api/v2/docs/p/<project_id>/d/<dataset_id>/<operation> endpoint.

New documents are created with the create operation, and existing documents are changed with the update operation. The upsert operation creates a document if it does not exist yet and updates it otherwise. Documents are deleted and restored with the delete and undelete operations respectively.

The fields url, published, and content are required. Some fields that are left empty (for example source_type, post_type and lang) are filled automatically with default values or, where possible, with extracted values. A complete overview of writable fields can be found in the chapter Talkwalker Documents.
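
As a rough sketch, a minimal request body with the three required fields could be assembled as follows. The helper names now_ms, json_escape and minimal_doc are made up for illustration, and treating published as a Unix timestamp in milliseconds is an assumption inferred from the example values on this page.

```shell
# Current time as a Unix timestamp in milliseconds (assumed format of "published").
now_ms() {
  echo "$(( $(date +%s) * 1000 ))"
}

# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Simplification: control characters such as newlines are not handled.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Print a JSON body with the required fields url, published, and content.
# Usage: minimal_doc <url> <published_ms> <content>
minimal_doc() {
  printf '{"url":"%s","published":"%s","content":"%s"}\n' \
    "$(json_escape "$1")" "$2" "$(json_escape "$3")"
}

minimal_doc "http://www.example.com/docs/doc1.html" "$(now_ms)" 'Example "quoted" content.'
```

The printed body could be passed to curl with -d, in the same way as the hand-written bodies in the examples.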

note

source_type defaults to OTHER, so in order for a document with default source_type to be displayed in the app, adapt the project settings accordingly.

Examples

Create

When executing create for a document that already exists, the request fails and the existing document remains unchanged. Only documents that match at least one topic in the project can be imported.

curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/create?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html",
  "title" : "This is a title",
  "content" : "Example content. Really not that much.",
  "tags_marking" : ["read"],
  "published" : "1430136532000",
  "extra_source_attributes": {
    "geo":"de"
  }
}' -H 'Content-Type: application/json; charset=UTF-8'

Example: Create a document in a dataset
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/d/<dataset_id>/create?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html",
  "title" : "This is a title",
  "content" : "Example content. Really not that much.",
  "published" : "1430136532000",
  "extra_source_attributes": {
    "geo":"de"
  }
}' -H 'Content-Type: application/json; charset=UTF-8'

Please see the section Talkwalker Documents for a list of all writable fields that can be imported/modified with the Talkwalker API.

Update

When executing update and the document does not yet exist, the request fails and no new document is created.

Example: Setting a new title field, adding an important tag, and removing the read tag
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/update?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html",
  "title" : "This is a new title",
  "content" : "Example content. Really not that much.",
  "+tags_marking" : ["important"],
  "-tags_marking" : ["read"],
  "extra_author_attributes" : {
    "name" : null
  },
  "published" : "1430136532000"
}' -H 'Content-Type: application/json; charset=UTF-8'

Fields of type array can be updated in three ways: using "<fieldname>" to replace the whole array, "+<fieldname>" to add items to the array, and "-<fieldname>" to remove items. Fields can be cleared by explicitly setting them to null.

curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/update?access_token=<access_token>' -d '
{
  "url" : "http://www.news_site.com/news/news1.html",
  "+customer_entities" : [ {
    "type": "Category",
    "id": ["Sports","Football"]
  }, {
    "type": "Place",
    "id": ["USA", "Austin TX"]
  } ]
}' -H 'Content-Type: application/json; charset=UTF-8'

type is the entity type (e.g. Person, Brand, Category), and id is the actual entity name or hierarchy (e.g. Barack Obama, BMW, News). Types are used for grouping entities in theme clouds; IDs are the themes displayed in the theme clouds. Hierarchical IDs are defined as an array (the order is important!). When multiple different entities have the same name (e.g. two persons with the same name), a unique identifier can be added after two underscores, for example Max Mustermann__1, Max Mustermann__2 or Max Mustermann__politician. Only the part of the ID before the underscores is displayed in the Talkwalker user interface, and only the first 2 levels of a hierarchical ID are shown.
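
The naming convention for disambiguated entities can be illustrated with a short shell sketch. The function entity_display_name is hypothetical; it only mirrors the rule described above and is not a statement about how Talkwalker implements it. Cutting at the first double underscore is an assumption.

```shell
# Return the part of an entity ID shown in the user interface:
# the text before the "__" separator (assumption: the first "__" counts).
entity_display_name() {
  printf '%s\n' "${1%%__*}"
}

entity_display_name "Max Mustermann__1"          # prints: Max Mustermann
entity_display_name "Max Mustermann__politician" # prints: Max Mustermann
entity_display_name "BMW"                        # prints: BMW
```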

Upsert

The first upsert operation works like a create operation.

curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/upsert?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html",
  "title" : "This is a title",
  "content" : "Example content. Really not that much.",
  "tags_marking" : ["read"],
  "published" : "1430136532000",
  "extra_source_attributes": {
    "geo":"de"
  }
}' -H 'Content-Type: application/json; charset=UTF-8'

The second upsert operation works like an update operation: content is set to a new value.

curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/upsert?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html",
  "content" : "Updated content."
}' -H 'Content-Type: application/json; charset=UTF-8'

Delete

Example: Deleting a document
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/delete?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html"
}' -H 'Content-Type: application/json; charset=UTF-8'

Example: Deleting a document from a dataset
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/d/<dataset_id>/delete?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html"
}' -H 'Content-Type: application/json; charset=UTF-8'

Undelete

Example: Undeleting a document
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/undelete?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html"
}' -H 'Content-Type: application/json; charset=UTF-8'

Credits

Document create, update, and delete calls do not consume credits.

Rate Limit

This endpoint is limited to 120 calls per minute.
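
One way to stay under this limit from a script is to pause between calls. The sketch below spaces calls evenly; the throttled wrapper and the CALLS_PER_MINUTE and INTERVAL_MS variables are illustrative names, and fractional arguments to sleep are assumed to be supported (as they are in GNU and BSD sleep).

```shell
CALLS_PER_MINUTE=120
# Minimum spacing between calls in milliseconds: 60000 / 120 = 500.
INTERVAL_MS=$(( 60000 / CALLS_PER_MINUTE ))

# Run a command, then wait long enough to respect the limit.
# Usage: throttled <command> [args...]
throttled() {
  "$@"
  status=$?
  sleep "$(printf '%d.%03d' $(( INTERVAL_MS / 1000 )) $(( INTERVAL_MS % 1000 )))"
  return "$status"
}

# Example with a harmless command; in practice this would be the curl call.
for doc in doc1 doc2 doc3; do
  throttled echo "would send request for $doc"
done
```

Because the pause is added on top of the time each call takes, the effective rate stays somewhat below the limit.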

Multiple Documents

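Multiple documents can be manipulated using the https://api.talkwalker.com/api/v2/docs/p/<project_id> or https://api.talkwalker.com/api/v2/docs/p/<project_id>/d/<dataset_id> endpoint.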

The execution order of the given document operations is not guaranteed, so multiple operations on the same document in a single request should be avoided.

curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>?access_token=<access_token>' -d '
[{
  "create": {
    "url": "http://www.example.com/docs/doc1.html",
    "title" : "This is the title of doc 1",
    "content" : "and this is the content of doc 1",
    "published" : "1430136532000"
  }
}, {
  "update": {
    "url": "http://www.example.com/docs/doc2.html",
    "title" : "This is the title of doc 2",
    "content" : "and this is the content of doc 2"
  }
}, {
  "delete": {
    "url": "http://www.example.com/docs/doc3.html"
  }
}]' -H 'Content-Type: application/json; charset=UTF-8'

Example: Multiple operations in a dataset
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/d/<dataset_id>?access_token=<access_token>' -d '
[{
  "create": {
    "url": "http://www.example.com/docs/doc1.html",
    "title" : "This is the title of doc 1",
    "content" : "and this is the content of doc 1",
    "published" : "1430136532000"
  }
}, {
  "update": {
    "url": "http://www.example.com/docs/doc2.html",
    "title" : "This is the title of doc 2",
    "content" : "and this is the content of doc 2"
  }
}, {
  "delete": {
    "url": "http://www.example.com/docs/doc3.html"
  }
}]' -H 'Content-Type: application/json; charset=UTF-8'

Please see the section Talkwalker Documents for a list of all writable fields that can be imported/modified with the Talkwalker API.

If one or more operations fail, the response has the status code 49 and includes details of the failure.

danger

The HTTP code of the response is still 200, even if some operations failed. This means that every document modification in the response needs to be checked separately for a (partial) failure.

Credits

Document create, update, and delete calls do not consume credits.

Rate Limit

This endpoint is limited to 120 calls per minute.

Custom Fields

To create a new document that includes custom fields, these custom fields have to be added to the project first (see Project Settings/Scoring Engine/Custom Metrics in the Talkwalker application). The following example assumes a project with two custom fields: one decimal number, double_1, and one string, text_1.

A document like the following, including these two custom fields, can be created for this project:

curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/create?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html",
  "title" : "This is a match",
  "content" : "Example content. Really not that much.",
  "tags_marking" : ["read"],
  "published" : "1430136532000",
  "custom": {
    "text_1" : "Example text. Also not that much.",
    "double_1" : 0.75
  }
}' -H 'Content-Type: application/json; charset=UTF-8'

When matching a custom field, the result has the following form.

{
  "chunk_type" : "CT_RESULT",
  "chunk_result" : {
    "data" : {
      "data" : {
        ...
        "custom" : {
          "text_1" : "Example text. Also not that much."
        }
      },
      "highlighted_data" : [ {
        "title_snippet" : "This is a <b>match</b>",
        "content_snippet" : "Example content. Really not that much.",
        "matched" : {
          "project_id" : "<project_id>",
          "project_profiles" : [ {
            "id" : "jl124veg_32fssfsd9238",
            "type" : "topic",
            "title" : "Match"
          } ]
        }
      } ]
    }
  }
}
