How to monitor live data based on keywords.

Use-case description

You want to display, in real time, new mentions and articles retrieved by Talkwalker that match the keywords and filter rules you define. In this use case, your system waits for Talkwalker to push articles and mentions so that it can display them immediately.

Important

Social media monitoring does not apply to this use case, because social media content is collected only for pages monitored inside your Talkwalker project. To collect this data, monitor it from your projects: see the section How to monitor your Talkwalker’s project.

Create/update the stream

This stream monitors, in real time, new articles crawled by Talkwalker into the Talkwalker global database (no social media content), based on keywords or query filters.

More about query syntax.

To create a stream you need to provide:

  • A stream ID (defined by the consumer).
  • One or multiple query filters (called rules in the Streaming API).
note

If the stream ID already exists, the request will override the rules.

Command
curl -X PUT 'https://api.talkwalker.com/api/v3/stream/s/<stream_id>?access_token=<access_token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"rules": [
{
"rule_id": "rule_1",
"query": "cats AND dogs"
},
{
"rule_id": "rule_2",
"query": "electric AND lang:en AND sentiment:negative"
}
]
}'

In this sample, all new articles which match rule_1 OR rule_2 will be pushed to the stream.
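The same request can be prepared from application code. The following is a minimal Python sketch using only the standard library; the helper name build_stream_request and the placeholder token are illustrative assumptions, not part of the Talkwalker API. It only builds the request and does not send it.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl command above.
API_BASE = "https://api.talkwalker.com/api/v3/stream/s"


def build_stream_request(stream_id, access_token, rules):
    """Build (but do not send) the PUT request that creates or overrides a stream.

    `rules` maps a rule ID to its query string (hypothetical helper signature).
    """
    url = "{}/{}?{}".format(
        API_BASE,
        urllib.parse.quote(stream_id, safe=""),
        urllib.parse.urlencode({"access_token": access_token}),
    )
    payload = {
        "rules": [
            {"rule_id": rule_id, "query": query}
            for rule_id, query in rules.items()
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_stream_request(
    "stream_1",
    "TOKEN",  # placeholder, replace with your access token
    {
        "rule_1": "cats AND dogs",
        "rule_2": "electric AND lang:en AND sentiment:negative",
    },
)
# To actually send it: urllib.request.urlopen(request)
```

Because an existing stream ID is overridden, sending this request twice with different rules leaves only the second set of rules active.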

Listen to the stream

General use-case

Once created, you need to listen to the stream in order to receive new articles which match your keywords.

Command
curl -X GET 'https://api.talkwalker.com/api/v3/stream/s/<stream_id>/results?access_token=<access_token>'
Sample
{
"chunk_type": "CT_CONTROL",
"chunk_control": {
"stream": [
{
"id": "stream_1",
"status": "active"
}
],
"connection_id": "#rome42r1ymtx#"
}
}
{
"chunk_type": "CT_RESULT",
"chunk_result": {
"data": {
"data": {
"url": "https://www.bbc.co.uk/news/education-64265029",
"indexed": 1673890612733,
"search_indexed": 1673890625896,
"published": 1673888410000,
"title": "Teachers from NEU union to strike in England and Wales",
"content": "Teachers from NEU union to strike in England and Wales\n\n6 minutes ago\n\nPA Media\n\nBy Vanessa Clarke\n\nEducation reporter\n\nTeach...",
"title_snippet": "Teachers from NEU union to strike in England and Wales",
"content_snippet": "Teachers from NEU union to strike in England and Wales\n\n6 minutes ago\n\nPA Media\n\nBy Vanessa Clarke\n\nEducation...",
"root_url": "https://www.bbc.co.uk/",
"domain_url": "http://bbc.co.uk/",
"host_url": "http://www.bbc.co.uk/",
"parent_url": "https://www.bbc.co.uk/news/education-64265029",
"lang": "en",
"porn_level": 0,
"fluency_level": 100,
"DEPRECATED_spam_level": 0,
"sentiment": -5,
"source_type": ["ONLINENEWS", "ONLINENEWS_TVRADIO"],
"post_type": ["TEXT"],
"tokens_title": [
"NEU union",
"and Wales",
"Teachers",
"Teachers",
"Wales",
"Wales",
"England",
"England",
"strike",
"union"
],
"tokens_content": [
"E-ACT Parkwood Academy",
"Branwen Jeffreys Matthew",
"National Education Union",
"Vanessa Clarke Education",
"Jeffreys Matthew Tyers",
"Jeffreys Matthew",
"Parkwood Academy",
"Vanessa Clarke",
"E-ACT Parkwood",
"Clarke Education"
],
"tokens_mention": ["@BBC_HaveYourSayUpload"],
"images": [
{
"url": "https://ichef.bbci.co.uk/news/976/cpsprodpb/0DAA/production/_128289430_matthew.jpg"
},
{
"url": "https://ichef.bbci.co.uk/news/976/cpsprodpb/D66A/production/_128309845_4f4a8a862119bcf8e92eba7dbf73489d0f2723f0.jpg"
}
],
"cluster_id": "http://twitter.com/FionaSimpson21/status/1615032006005358594",
"tags_internal": ["containsImage", "hasImage", "isQuestion"],
"article_extended_attributes": {
"facebook_shares": 736,
"facebook_likes": 1745,
"twitter_shares": 610,
"num_comments": 1675,
"facebook_reactions_total": 1745
},
"source_extended_attributes": {
"alexa_pageviews": 286000000,
"alexa_unique_visitors": 69417480,
"semrush_pageviews": 635249915,
"semrush_unique_visitors": 125264954
},
"extra_author_attributes": {
"id": "ex:www.bbc.co.uk-147942660",
"name": "vanessa clarke education reporter",
"gender": "FEMALE"
},
"extra_source_attributes": {
"world_data": {
"continent": "Europe",
"country": "United Kingdom",
"region": "Greater London",
"city": "London",
"longitude": -0.1153564453125,
"latitude": 51.50115966796875,
"country_code": "uk",
"resolution": "COUNTRY"
},
"id": "ex:www.bbc.co.uk",
"name": "www.bbc.co.uk"
},
"engagement": 4766,
"reach": 125264954,
"entity_url": [
{
"url": "https://www.bbc.co.uk/news/education-64238689"
},
{
"url": "http://www.bbc.co.uk/usingthebbc/privacy-policy/"
},
{
"url": "https://www.gov.uk/government/publications/handling-strike-action-in-schools",
"resolved_url": "https://www.gov.uk/government/publications/handling-strike-action-in-schools"
},
{
"url": "http://www.bbc.co.uk/usingthebbc/terms/"
},
{
"url": "https://ichef.bbci.co.uk/news/976/cpsprodpb/D66A/production/_128309845_4f4a8a862119bcf8e92eba7dbf73489d0f2723f0.jpg"
},
{
"url": "https://api.whatsapp.com/send?phone=+447756165803"
},
{
"url": "https://www.bbc.co.uk/news/uk-northern-ireland-63291611"
},
{
"url": "http://twitter.com/BBC_HaveYourSay"
},
{
"url": "https://www.bbc.co.uk/news/uk-scotland-64209973"
},
{
"url": "https://ichef.bbci.co.uk/news/976/cpsprodpb/0DAA/production/_128289430_matthew.jpg"
},
{
"url": "https://www.bbc.co.uk/send/u16904890?ptrt=https://www.bbc.co.uk/news/10725415"
}
],
"matched": {
"appearance": "UPDATED"
},
"word_count": 697,
"trending_score": 4
},
"highlighted_data": [
{
"title_snippet": "Teachers from NEU union to strike in England and Wales",
"content_snippet": "...work not worrying about bills, gas, [or] <b>electric</b> but we can come with our sole focus - of improving students&#39;...",
"matched": {
"rule_id": "rule_2",
"stream_id": "stream_1"
}
}
]
}
}
}
{
"chunk_type": "CT_RESULT",
"chunk_result": {
"data": {
"data": {
"url": "https://theathletic.com/4099180/2023/01/17/tom-brady-bucs-cowboys-recap/",
"indexed": 1673942271920,
"search_indexed": 1673942281434,
"published": 1673939645000,
"title": "Jones: Tom Brady didn’t get his storybook ending, and the next chapter is a mystery",
"content": "Monday’s 31-14 throttling at the hands of the Dallas Cowboys in an NFC wild-card playoff game was not what Brady envisioned when he ended his 40-day retirement late last winter for one more stab at glory with the Tampa B...",
"title_snippet": "Jones: Tom Brady didn’t get his storybook ending, and the next chapter is a mystery",
"content_snippet": "Monday’s 31-14 throttling at the hands of the Dallas Cowboys in an NFC wild-card playoff game was not what Brady envisioned when he ended his 40-day retirement late last winter for one more stab at glory with the Tampa Bay Buccaneers.\n\nBut storybook...",
"root_url": "https://theathletic.com/",
"domain_url": "http://theathletic.com/",
"host_url": "http://theathletic.com/",
"parent_url": "https://theathletic.com/4099180/2023/01/17/tom-brady-bucs-cowboys-recap/",
"lang": "en",
"porn_level": 0,
"fluency_level": 100,
"DEPRECATED_spam_level": 0,
"sentiment": -5,
"source_type": [
"ONLINENEWS",
"ONLINENEWS_OTHER"
],
"post_type": [
"TEXT"
],
"tokens_title": [
"Tom Brady didn",
"Tom Brady",
"Tom Brady",
"next chapter",
"his storybook",
"Brady didn",
"Jones",
"Jones",
"Brady",
"Brady"
],
"tokens_content": [
"Raymond James Stadium",
"New York Jets",
"Tampa Bay Buccaneers",
"Public Service Announcement",
"Tampa Bay",
"Tampa Bay",
"Tampa Bay",
"Dan Quinn",
"Tom Brady",
"Tom Terrific"
],
"tokens_mention": [
"@NFL",
"@NFLResearch",
"@SportsCenter"
],
"images": [
{
"url": "https://cdn.theathletic.com/app/uploads/2023/01/17011150/GettyImages-1456930345-1024x683.jpg"
}
],
"tags_internal": [
"containsImage",
"hasImage",
"isQuestion"
],
"source_extended_attributes": {
"alexa_pageviews": 13900000,
"alexa_unique_visitors": 6587678,
"semrush_pageviews": 31925992,
"semrush_unique_visitors": 18330606
},
"extra_author_attributes": {
"id": "ex:theathletic.com-1508819341",
"name": "mike jones",
"gender": "MALE"
},
"extra_source_attributes": {
"world_data": {
"continent": "North America",
"country": "United States",
"region": "Washington, D.C.",
"city": "Washington, D.C.",
"longitude": -77.0086669921875,
"latitude": 38.89984130859375,
"country_code": "us",
"resolution": "COUNTRY"
},
"id": "ex:theathletic.com",
"name": "theathletic.com"
},
"engagement": 0,
"reach": 18330606,
"entity_url": [
{
"url": "https://theathletic.com/nfl/player/mike-evans-zEpIAizdGk76yimT/"
},
{
"url": "https://twitter.com/SportsCenter/status/1615206213276389376?ref_src=twsrc%5Etfw"
},
{
"url": "https://twitter.com/NFL/status/1615208038721363970?ref_src=twsrc%5Etfw"
},
{
"url": "https://theathletic.com/nfl/team/raiders/"
},
{
"url": "pic.twitter.com/pSqCW5xzUD"
},
{
"url": "https://theathletic.com/nfl/team/buccaneers/"
},
{
"url": "https://theathletic.com/4086768/2023/01/16/cowboys-buccaneers-nfl-wild-card-streaming-takeaways/"
},
{
"url": "https://theathletic.com/nfl/"
},
{
"url": "https://t.co/pSqCW5xzUD"
},
{
"url": "https://theathletic.com/nfl/player/julio-jones-kakQorhjd9fNgZTu/"
},
{
"url": "https://theathletic.com/nfl/team/cowboys/"
},
{
"url": "https://theathletic.com/nfl/player/dak-prescott-1NrLfCowWevnKYEe/"
},
{
"url": "https://t.co/gv5PmG7iCO"
},
{
"url": "https://theathletic.com/nfl/team/jets/"
},
{
"url": "https://theathletic.com/nfl/team/patriots/"
},
{
"url": "https://theathletic.com/nfl/player/cameron-brate-jZpNv26wSUZ165Ct/"
},
{
"url": "https://theathletic.com/3998150/2022/12/16/tom-brady-2023-free-agency/"
},
{
"url": "https://twitter.com/NFLResearch/status/1615171025477701632?ref_src=twsrc%5Etfw"
},
{
"url": "pic.twitter.com/gv5PmG7iCO"
}
],
"matched": {
"appearance": "UPDATED"
},
"word_count": 1545
},
"highlighted_data": [
{
"title_snippet": "Jones: Tom Brady didn’t get his storybook ending, and the next chapter is a mystery",
"content_snippet": "...better part of the last two decades), the atmosphere at Raymond James Stadium was <b>electric</b>.\n\n“Allow me to reintroduce myself,” Jay-Z boomed over the sound system, and Brady jogged along the home sideline and into the North end zone, where he punched...",
"matched": {
"rule_id": "rule_2",
"stream_id": "stream_1"
}
}
]
}
}
}
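The response is a sequence of JSON objects (chunks) arriving over one long-lived connection. A consumer therefore has to split the incoming text into complete objects before handling them. Below is a minimal Python sketch of one way to do this with the standard library; it assumes the text has already been decoded to str, and the function name iter_chunks is illustrative. It makes no assumption about how chunks are delimited on the wire.

```python
import json


def iter_chunks(text_pieces):
    """Yield each complete JSON object found in a sequence of text pieces.

    Objects may be split across pieces or packed several to a piece.
    Simplification: malformed input is buffered rather than reported.
    """
    decoder = json.JSONDecoder()
    buffer = ""
    for piece in text_pieces:
        buffer += piece
        while True:
            buffer = buffer.lstrip()  # raw_decode rejects leading whitespace
            if not buffer:
                break
            try:
                chunk, end = decoder.raw_decode(buffer)
            except json.JSONDecodeError:
                break  # incomplete object: wait for more text
            yield chunk
            buffer = buffer[end:]


# Two chunks, the second one split across two reads (trimmed sample data).
pieces = [
    '{"chunk_type": "CT_CONTROL", "chunk_control": '
    '{"connection_id": "#rome42r1ymtx#"}}\n{"chunk_ty',
    'pe": "CT_RESULT", "chunk_result": {"data": {"data": '
    '{"url": "https://www.bbc.co.uk/news/education-64265029"}}}}',
]
chunk_types = [chunk["chunk_type"] for chunk in iter_chunks(pieces)]
```

In a real consumer, text_pieces would be the decoded reads from the open HTTP response instead of a fixed list.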

If you defined multiple rules in your stream, the matching rule is returned in the object highlighted_data, and the matched keywords are surrounded by the HTML tag <b>:

"highlighted_data": [
{
"title_snippet": "Teachers from NEU union to strike in England and Wales",
"content_snippet": "...work not worrying about bills, gas, [or] <b>electric</b> but we can come with our sole focus - of improving students&#39;...",
"matched": {
"rule_id": "rule_2",
"stream_id": "stream_1"
}
}
]

The snippet can be used for direct display in your system.
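As an illustration, the matching rule and the display-ready snippets can be pulled out of a parsed chunk as follows. This is a sketch based on the structure of the sample above (chunk_result.data.highlighted_data); the function name matched_snippets is an assumption, and the sample chunk is trimmed for brevity.

```python
def matched_snippets(chunk):
    """Return (rule_id, title_snippet, content_snippet) tuples for a CT_RESULT chunk.

    Any other chunk type (for example CT_CONTROL) yields an empty list.
    """
    if chunk.get("chunk_type") != "CT_RESULT":
        return []
    result = chunk.get("chunk_result", {}).get("data", {})
    return [
        (
            item.get("matched", {}).get("rule_id"),
            item.get("title_snippet"),
            item.get("content_snippet"),
        )
        for item in result.get("highlighted_data", [])
    ]


# Trimmed version of the first CT_RESULT chunk shown above.
sample_chunk = {
    "chunk_type": "CT_RESULT",
    "chunk_result": {
        "data": {
            "data": {"url": "https://www.bbc.co.uk/news/education-64265029"},
            "highlighted_data": [
                {
                    "title_snippet": "Teachers from NEU union to strike in England and Wales",
                    "content_snippet": "...bills, gas, [or] <b>electric</b> but we can...",
                    "matched": {"rule_id": "rule_2", "stream_id": "stream_1"},
                }
            ],
        }
    },
}

matches = matched_snippets(sample_chunk)
```

Note that the snippets contain HTML (the <b> tag and entities such as &#39;), so they should be rendered as HTML rather than as plain text.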

info

When you launch this query, the connection remains open. If you have configured a timeout, you may therefore receive an error message from your infrastructure even though the stream is working properly.

Listen to a stream based on additional keywords or query filters

You can add an additional set of keywords or filter rules to an existing stream by adding a “q” parameter to your call:

Command
curl -X GET 'https://api.talkwalker.com/api/v3/stream/s/<stream_id>/results?access_token=<access_token>&q=<rule_3>'

In this sample, all new articles which match (rule_1 AND rule_3) OR (rule_2 AND rule_3) will be pushed to the stream.
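Since a query usually contains spaces and characters such as ":", the value of q has to be URL-encoded when the call is built in code. A minimal Python sketch, assuming the standard library only; the helper name results_url and the example query are illustrative:

```python
import urllib.parse

# Endpoint taken from the curl command above.
RESULTS_URL = "https://api.talkwalker.com/api/v3/stream/s/{}/results?{}"


def results_url(stream_id, access_token, extra_query=None):
    """Build the URL used to listen to a stream, optionally with a q filter."""
    params = {"access_token": access_token}
    if extra_query:
        params["q"] = extra_query
    # quote_via=quote encodes spaces as %20 instead of "+".
    query_string = urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
    return RESULTS_URL.format(urllib.parse.quote(stream_id, safe=""), query_string)


url = results_url("stream_1", "TOKEN", "bitcoin AND lang:en")
```

With this example, only articles that match one of the stream's rules and also match "bitcoin AND lang:en" would be pushed.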

note

In the result, the additional q parameter is not part of the object highlighted_data, but the corresponding keywords are still surrounded by the HTML tag <b>:

[Image: streaming_monitor_keywords_highlighted.png]

Control your credits or test a stream

You may want to limit the number of results in the transaction:

  • To control the number of credits consumed by your stream.
  • To check the outcome of your stream while testing.

To do so, add the parameter max_hits to your call.

Sample
curl -X GET 'https://api.talkwalker.com/api/v3/stream/s/<stream_id>/results?max_hits=<number of results>&access_token=<access_token>'
note

Once the number of results is reached, the transaction ends. New articles or mentions are no longer pushed to the stream.

If you want to avoid any loss of data

Persisted data

If it is essential not to lose any article or mention you monitor, you can send them to a collector (which acts like a queue) and listen to the collector as you would a stream.

The main difference is that once the collector is created and active, all new articles and mentions will be pushed to the collector (and consume credits) even if you are not listening to the collector. The data will remain accessible in the collector for 7 days.

Create/update a collector on a stream

You can specify multiple streams which will be pushed to the collector.

You can add additional keywords to monitor in the collector.

Sample
curl -X PUT 'https://api.talkwalker.com/api/v3/stream/c/<collector_id>?access_token=<access_token>' \
--header 'Content-Type: application/json' \
--data-raw '{
"collector_query": {
"queries": [
{
"id": "rule_1",
"query": "bitcoin"
}
],
"streams": [
"stream_1"
]
}
}'

In this sample, all documents which match stream_1 AND the collector rule_1 will be pushed to the collector.
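The request body above can also be generated in code. The sketch below mirrors the structure of the sample; the helper name collector_payload is illustrative, and leaving out "queries" when no additional keywords are given is an assumption drawn from the wording "you can add additional keywords", not something confirmed here.

```python
import json


def collector_payload(stream_ids, queries=None):
    """Return the JSON body for creating or updating a collector.

    `stream_ids` lists the streams pushed to the collector.
    `queries` optionally maps a rule ID to an additional query string.
    """
    collector_query = {"streams": list(stream_ids)}
    if queries:
        # Assumption: "queries" may be omitted when there are no extra keywords.
        collector_query["queries"] = [
            {"id": rule_id, "query": query} for rule_id, query in queries.items()
        ]
    return json.dumps({"collector_query": collector_query})


payload = collector_payload(["stream_1"], {"rule_1": "bitcoin"})
```

The resulting string can be sent as the body of the PUT request shown in the sample.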

note

If the collector ID already exists, the request will override the rules.

Read the collector.

Please refer to the section How to read the collector.