Modifying documents with the Talkwalker API
Modifications made to documents via the Talkwalker API overwrite changes made in Talkwalker. All earlier changes (manual or via export/import) in the same project are lost.
Parameters
parameter | description | required | values |
---|---|---|---|
access_token | a read/write token specified in the API application | required | |
annotate | The Talkwalker API recalculates the given field based on the updated content | optional | sentiment |
return_entry | Specifies if the modified document should be returned | optional | hide (default), show |
Advanced Parameters
All parameters in this section are optional.
parameter | description | values |
---|---|---|
proc_tokens_calc | Specifies when and which tokens are generated | all_if_new (default), all_always_if_possible |
proc_lang_autodetect | Specifies when the language is automatically detected | only_if_new (default), always_if_new_content |
proc_content_ellipsis | Special behavior for ellipsis in string fields | no_special_handling, ignore_if_start_same (default) |
proc_lang_fallback | Fallback language | (default: en) |
proc_annotate_language | Language annotation | annotate_if_not_set (default), use_default |
proc_annotate_sentiment | Sentiment annotation | annotate_if_not_set (default), use_default |
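The parameters above can be combined into a request URL. The following is a minimal Python sketch, assuming that the optional and advanced parameters are passed as query-string parameters in the same way as access_token in the examples on this page (the page does not state this explicitly). build_docs_url is a hypothetical helper, not part of the Talkwalker API, and the project ID and token are placeholders.

```python
from urllib.parse import urlencode, quote

# Base URL taken from the curl examples on this page.
BASE_URL = "https://api.talkwalker.com/api/v2/docs/p"

def build_docs_url(project_id, operation, access_token, **params):
    """Build the URL for a document operation (hypothetical helper).

    Assumption: optional parameters such as return_entry or
    proc_lang_fallback are sent as query parameters, like access_token.
    """
    query = {"access_token": access_token}
    # Skip parameters left unset so that the documented defaults apply.
    query.update({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/{quote(project_id, safe='')}/{operation}?{urlencode(query)}"

url = build_docs_url(
    "my_project", "update", "my_token",
    return_entry="show", proc_lang_fallback="de", annotate=None,
)
print(url)
```

Parameters passed as None are omitted from the query string, so the defaults listed in the tables above stay in effect.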
Single Documents
To change result documents, use the endpoint https://api.talkwalker.com/api/v2/docs/p/<project_id>/<operation>, as used in the examples below.
New documents are created with the create operation; existing documents are updated with the update operation.
With the upsert operation, a document is created if it is not present yet; otherwise it is updated.
Documents are deleted and restored with the delete and undelete operations, respectively.
The fields url, published, and content are required.
When left empty, some fields (for example source_type, post_type and lang) are filled automatically with default values or, if possible, with automatically extracted values.
A complete overview of the writable fields can be found in the chapter Talkwalker Documents.
source_type defaults to OTHER, so for a document with the default source_type to be displayed in the app, adapt the project settings accordingly.
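A client can check the required fields before sending a request. The sketch below is a hedged illustration only: missing_required_fields is a hypothetical helper, and it assumes that the requirement applies when a document is created (the update examples on this page send fewer fields).

```python
# Required fields as listed on this page.
REQUIRED_FIELDS = ("url", "published", "content")

def missing_required_fields(document):
    """Return the required fields that are absent or empty in `document`."""
    return [f for f in REQUIRED_FIELDS if document.get(f) in (None, "")]

doc = {
    "url": "http://www.example.com/docs/doc1.html",
    "content": "Example content. Really not that much.",
}
missing = missing_required_fields(doc)
print(missing)  # ['published']
```

Fields such as source_type, post_type and lang are deliberately not checked, since they may be left empty and filled automatically.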
Examples
Create
When executing create for a document that already exists, the request fails and the existing document remains unchanged. Only documents that match at least one topic in the project can be imported.
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/create?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html",
  "title" : "This is a title",
  "content" : "Example content. Really not that much.",
  "tags_marking" : ["read"],
  "published" : "1430136532000",
  "extra_source_attributes": {
    "geo":"de"
  }
}' -H 'Content-Type: application/json; charset=UTF-8'
Please see the section Talkwalker Documents for a list of all writable fields that can be imported/modified with the Talkwalker API.
Update
When executing update and the document does not yet exist, the request fails and no new document is created.
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/update?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html",
  "title" : "This is a new title",
  "content" : "Example content. Really not that much.",
  "+tags_marking" : ["important"],
  "-tags_marking" : ["read"],
  "extra_author_attributes" : {
    "name" : null
  },
  "published" : "1430136532000"
}' -H 'Content-Type: application/json; charset=UTF-8'
Fields of type array can be updated in three ways: using "<fieldname>" to replace the whole array, "+<fieldname>" to add an item to the array, and "-<fieldname>" to remove an item.
Fields can be cleared by explicitly setting them to null.
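The three array forms and null clearing can be illustrated with a small local model. This Python sketch only mimics the rules described above on a plain dictionary; it is not the server implementation. apply_update is a hypothetical helper, it handles top-level fields only, and skipping items that are already present when adding is an assumption.

```python
def apply_update(document, update):
    """Apply an update payload to a local copy of `document` (illustration only)."""
    result = dict(document)
    for key, value in update.items():
        if key.startswith("+"):
            # "+<fieldname>": add items to the existing array.
            field = key[1:]
            current = list(result.get(field) or [])
            # Assumption: items already present are not added twice.
            current.extend(v for v in value if v not in current)
            result[field] = current
        elif key.startswith("-"):
            # "-<fieldname>": remove items from the existing array.
            field = key[1:]
            result[field] = [v for v in (result.get(field) or []) if v not in value]
        elif value is None:
            # null clears the field.
            result.pop(key, None)
        else:
            # "<fieldname>": replace the whole value.
            result[key] = value
    return result

doc = {"title": "This is a title", "tags_marking": ["read"], "lang": "en"}
update = {
    "title": "This is a new title",
    "+tags_marking": ["important"],
    "-tags_marking": ["read"],
    "lang": None,
}
updated = apply_update(doc, update)
print(updated)
```

After the call, tags_marking contains only "important", the title is replaced, and lang is cleared; the original dictionary is left unchanged.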
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/update?access_token=<application_access_token>' -d '
{
  "url" : "http://www.news_site.com/news/news1.html",
  "+customer_entities" : [ {
    "type": "Category",
    "id": ["Sports","Football"]
  }, {
    "type": "Place",
    "id": ["USA", "Austin TX"]
  } ]
}' -H 'Content-Type: application/json; charset=UTF-8'
type is the entity type (e.g. Person, Brand, Category, etc.); id is the actual entity name or hierarchy (e.g. Barack Obama, BMW, News, etc.).
Types are used for grouping entities in theme clouds; IDs are the themes displayed in the theme clouds.
Hierarchical IDs are defined as an array (the order is important!).
When multiple different entities have the same name (e.g. two persons with the same name), a unique identifier can be added after two underscores, for example Max Mustermann__1, Max Mustermann__2 or Max Mustermann__politician. Only the part of the ID before the underscores is displayed in the Talkwalker user interface.
The Talkwalker user interface only shows the first 2 levels of IDs.
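The two display rules above (the suffix after two underscores is hidden, and only the first 2 levels are shown) can be sketched as follows. displayed_levels is a hypothetical helper that models only these two stated rules; how the user interface actually renders the levels is not described on this page.

```python
def displayed_levels(entity_id, max_levels=2):
    """Return the ID levels that would be visible, per the two rules above."""
    # A plain string is treated as a single-level ID.
    levels = [entity_id] if isinstance(entity_id, str) else list(entity_id)
    # Keep the first levels only and drop any "__<suffix>" disambiguator.
    return [level.split("__", 1)[0] for level in levels[:max_levels]]

print(displayed_levels("Max Mustermann__2"))             # ['Max Mustermann']
print(displayed_levels(["USA", "Texas", "Austin__1"]))   # ['USA', 'Texas']
```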
Upsert
The first upsert operation works like a create operation.
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/upsert?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html",
  "title" : "This is a title",
  "content" : "Example content. Really not that much.",
  "tags_marking" : ["read"],
  "published" : "1430136532000",
  "extra_source_attributes": {
    "geo":"de"
  }
}' -H 'Content-Type: application/json; charset=UTF-8'
The second upsert operation works like an update operation: content is set to a new value.
curl -XPOST 'https://api.talkwalker.com/api/v2/docs/p/<project_id>/upsert?access_token=<access_token>' -d '
{
  "url" : "http://www.example.com/docs/doc1.html",
  "content" : "Updated content."
}' -H 'Content-Type: application/json; charset=UTF-8'
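The difference between create, update and upsert can be summarized with a small in-memory model. This Python sketch is an illustration of the behavior described on this page, not of the API itself: ProjectDocs is a hypothetical class, it assumes that documents are identified by their url (as the examples suggest), and it makes no network requests.

```python
class ProjectDocs:
    """In-memory stand-in for a project's documents, keyed by url (assumption)."""

    def __init__(self):
        self.docs = {}

    def create(self, doc):
        # create fails if the document already exists; the existing one is kept.
        if doc["url"] in self.docs:
            raise ValueError("create failed: document already exists")
        self.docs[doc["url"]] = dict(doc)

    def update(self, doc):
        # update fails if the document does not exist; nothing is created.
        if doc["url"] not in self.docs:
            raise ValueError("update failed: document does not exist")
        self.docs[doc["url"]].update(doc)

    def upsert(self, doc):
        # upsert creates the document if it is missing, otherwise updates it.
        if doc["url"] in self.docs:
            self.update(doc)
        else:
            self.create(doc)

URL = "http://www.example.com/docs/doc1.html"
store = ProjectDocs()
store.upsert({"url": URL, "title": "This is a title",
              "content": "Example content. Really not that much."})
store.upsert({"url": URL, "content": "Updated content."})
print(store.docs[URL])
```

After the second call the content is "Updated content." while the title from the first call is retained, mirroring the two upsert requests above.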