Streaming API Overview & Example
How it works
The Talkwalker Streaming API delivers real-time data through a persistent connection to our servers. Configure your stream with a set of filtering rules and connect to it; new results are then delivered in real time, as soon as our crawlers find them. No polling is needed to receive new data.
You set up and configure the Streaming API by defining rules (Boolean query, language, media types, etc.). The Streaming API then finds and collects all relevant data and adds it to your data stream, with individually highlighted snippets per matched rule. This lets you gather data for many rules through a single stream while easily matching the results back to your predefined rules.
Each rule can filter by title, content, author, language, URL, country, media type, and other parameters, using the same syntax as the Talkwalker Search interface. You can also supply a list of sources to include in or exclude from the stream, to narrow the results further. A single rule supports up to 50 operands. To create complex rules, combine operands using Boolean operators.
The documents are streamed in the order they are found by our crawlers and added to Talkwalker (i.e. by search_indexed timestamp). Custom sorting is not possible with the Streaming API; use the Search API if you need it. The documents are grouped in timeframes, each containing all documents that were indexed between the start and end time of that timeframe.
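The ordering timestamp can be inspected with a few lines of Python. This is a minimal sketch that assumes the published, indexed, and search_indexed fields are milliseconds since the Unix epoch, which is consistent with the sample values in the result chunk shown later in this document but is not stated explicitly. The helper name ms_to_utc is hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to an aware UTC datetime.

    Assumption: Talkwalker timestamps are epoch milliseconds.
    """
    return EPOCH + timedelta(milliseconds=ms)


# Values copied from the sample CT_RESULT chunk in this document.
doc = {
    "published": 1417999319393,
    "indexed": 1417999367498,
    "search_indexed": 1417999504832,
}

search_indexed_at = ms_to_utc(doc["search_indexed"])

# Time between publication and the moment the document became searchable,
# which is the moment that determines its position in the stream.
lag = search_indexed_at - ms_to_utc(doc["published"])

print(search_indexed_at.isoformat())
print(lag.total_seconds())
```

Because the stream is ordered by search_indexed rather than by publication time, a consumer that needs publication order has to re-sort on its own side.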
Each result counts as 1 credit, regardless of how many rules it matches.
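To keep a rough client-side tally of credit usage, one can count the result chunks received. The sketch below assumes that every delivered result arrives as exactly one chunk whose chunk_type is CT_RESULT, as in the sample output later in this document. The function name count_credits is hypothetical, and the tally is only a local estimate, not the billing figure.

```python
def count_credits(chunks):
    """Count one credit per CT_RESULT chunk.

    Assumption: one result per CT_RESULT chunk. Control chunks
    (CT_CONTROL) carry stream metadata and are not counted.
    """
    return sum(1 for chunk in chunks if chunk.get("chunk_type") == "CT_RESULT")


# Hypothetical, heavily shortened chunks for illustration only.
received = [
    {"chunk_type": "CT_CONTROL", "chunk_control": {}},
    {"chunk_type": "CT_RESULT", "chunk_result": {}},
    {"chunk_type": "CT_RESULT", "chunk_result": {}},
]

print(count_credits(received))
```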
- Updates made to a document by an agent (sentiment change, tagging, ...) are not transmitted to the Streaming API.
- Automatic tags applied by rules are not included either.
A brief example
The Talkwalker API streaming endpoint is used to stream results from Talkwalker.
Creating a Stream
curl -XPUT 'https://api.talkwalker.com/api/v3/stream/s/teststream?access_token=demo' \
  -d '{ "rules" : [{ "rule_id": "rule-1", "query": "cats" }] }' \
  -H 'Content-Type: application/json; charset=UTF-8'
{
  "status_code": "0",
  "status_message": "OK",
  "request": "PUT /api/v3/stream/s/teststream?access_token=demo",
  "result_stream": {
    "data": [
      {
        "stream_id": "teststream",
        "rules": [
          {
            "rule_id": "rule-1",
            "query": "cats"
          }
        ]
      }
    ]
  }
}
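The same request can be prepared from Python with only the standard library. This is a minimal sketch mirroring the curl call above: the URL, the rules payload, and the Content-Type header come from that example, while the function name build_create_stream_request and the API_BASE constant are hypothetical. The request is only constructed here; the line that would send it is left commented out so the sketch runs without network access.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.talkwalker.com/api/v3"  # taken from the curl example


def build_create_stream_request(stream_id, rules, access_token):
    """Build (but do not send) the PUT request that creates a stream."""
    query = urllib.parse.urlencode({"access_token": access_token})
    url = f"{API_BASE}/stream/s/{urllib.parse.quote(stream_id, safe='')}?{query}"
    body = json.dumps({"rules": rules}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json; charset=UTF-8"},
    )


request = build_create_stream_request(
    "teststream",
    [{"rule_id": "rule-1", "query": "cats"}],
    "demo",
)

print(request.get_method(), request.full_url)

# To actually create the stream, send the request:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["status_message"])
```

Several rules can be passed in the same list, each with its own rule_id, so that one stream carries the results for all of them.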
Streaming
curl -XGET 'https://api.talkwalker.com/api/v3/stream/s/teststream/results?access_token=demo'
The response is a stream of chunks. Each chunk contains either metadata about the Talkwalker stream (CT_CONTROL) or search results (CT_RESULT).
{
  "chunk_type" : "CT_CONTROL",
  "chunk_control" : {
    "stream" : [ {
      "id" : "teststream",
      "status" : "active"
    } ],
    "connection_id" : "#piv2tmbz2yxu#"
  }
}
{
  "chunk_type": "CT_RESULT",
  "chunk_result": {
    "data" : {
      "data" : {
        "url" : "http://example.blogspot.com/cats",
        "indexed" : 1417999367498,
        "search_indexed" : 1417999504832,
        "published" : 1417999319393,
        "title" : "Something cats",
        "content" : "Welcome to my colorful little island (...)",
        "title_snippet" : "Color and Light Inspirations in Jewelry: SUNNY RINGS :)",
        "root_url" : "http://example.blogspot.com/",
        "domain_url" : "http://blogspot.com/",
        "host_url" : "http://example.blogspot.com/",
        "parent_url" : "http://example.blogspot.com/2014/12/sunny-rings.html",
        "lang" : "en",
        ... // truncated
      }
    }
  }
}
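A client has to split the response into individual chunks and branch on chunk_type. The sketch below assumes the response body can be treated as a sequence of concatenated JSON objects, as the sample output suggests; the exact framing on the wire is not specified here. The function names iter_chunks and handle_chunk are hypothetical, and the sample text is a shortened copy of the chunks above. It works on text that is already complete; a live client would additionally need to buffer partially received objects.

```python
import json


def iter_chunks(text):
    """Yield each JSON object from a string of concatenated JSON documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        chunk, pos = decoder.raw_decode(text, pos)
        yield chunk


def handle_chunk(chunk):
    """Return a one-line description of a chunk based on its chunk_type."""
    kind = chunk.get("chunk_type")
    if kind == "CT_CONTROL":
        streams = chunk["chunk_control"].get("stream", [])
        return "control: " + ", ".join(f'{s["id"]}={s["status"]}' for s in streams)
    if kind == "CT_RESULT":
        doc = chunk["chunk_result"]["data"]["data"]
        return f'result: {doc["url"]}'
    return f"unknown chunk type: {kind}"


sample = """
{ "chunk_type": "CT_CONTROL",
  "chunk_control": { "stream": [ { "id": "teststream", "status": "active" } ],
                     "connection_id": "#piv2tmbz2yxu#" } }
{ "chunk_type": "CT_RESULT",
  "chunk_result": { "data": { "data": {
      "url": "http://example.blogspot.com/cats", "lang": "en" } } } }
"""

for chunk in iter_chunks(sample):
    print(handle_chunk(chunk))
```

Keeping the splitting step separate from the dispatch step means the handler stays unchanged if the transport or framing differs from what is assumed here.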
More on the Talkwalker Streaming API