
Field Information

Content

Talkwalker provides result snippets for all content. The content field always contains only the first words of the document; in addition, the content_snippet field contains the part of the document that matches the query. In the Streaming API, a snippet is provided for every matching rule.

URLs

To filter on specific websites in a query, use the fields domain_url and host_url. host_url matches a specific host such as www.talkwalker.com or blog.talkwalker.com, while domain_url matches all hosts in a domain. For example, domain_url:blog.talkwalker.com returns all results from the domain talkwalker.com, including those from www.talkwalker.com, whereas host_url:blog.talkwalker.com returns only results from blog.talkwalker.com and none from www.talkwalker.com.
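The difference between the two fields can be illustrated with a minimal Python sketch. This is not Talkwalker's implementation: the function names are hypothetical, and the domain is approximated as the last two labels of the host, which is an assumption that fails for suffixes such as co.uk (a real implementation would consult the Public Suffix List).

```python
def registered_domain(host: str) -> str:
    """Approximate the domain as the last two labels of the host.

    Assumption for illustration only: this is wrong for multi-part
    suffixes such as "co.uk".
    """
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def matches_host_url(doc_host: str, value: str) -> bool:
    """host_url semantics as described above: the exact host must match."""
    return doc_host.lower() == value.lower()


def matches_domain_url(doc_host: str, value: str) -> bool:
    """domain_url semantics as described above: any host in the same domain matches."""
    return registered_domain(doc_host) == registered_domain(value)


if __name__ == "__main__":
    for host in ("www.talkwalker.com", "blog.talkwalker.com"):
        print(
            host,
            "domain_url:", matches_domain_url(host, "blog.talkwalker.com"),
            "host_url:", matches_host_url(host, "blog.talkwalker.com"),
        )
```

With the filter value blog.talkwalker.com, both hosts satisfy the domain_url check, but only blog.talkwalker.com satisfies the host_url check.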

Sentiment

Talkwalker uses natural language processing (NLP) to compute a general sentiment for the documents in our index. The accuracy of automatic detection is limited by irony, sarcasm and misspellings in the documents.

Reach

The reach of an article/post represents the number of people who were reached by this article/post.

Note that the views are only set to a proper value if the host of the URL is either a bare domain (like theguardian.com) or a domain preceded by a well-known third-level subdomain (this mainly applies to www, e.g. www.theguardian.com).

Reach is set to 0 for all other hosts, that is, hosts with other third-level subdomains such as foobar.blogspot.com, because using the Alexa views of the whole domain would assign far too high a reach to mere sub-hosts. For imported documents, reach can be set via the Talkwalker API.
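The host rule described above can be sketched in Python as follows. This is an illustrative approximation, not Talkwalker's code: the function name, the WELL_KNOWN_SUBDOMAINS set, and the domain_page_views parameter are hypothetical, and the sketch assumes that a domain consists of exactly two labels.

```python
# Assumption: the text says the rule "mainly applies to www"; the full
# list of well-known subdomains is not documented here.
WELL_KNOWN_SUBDOMAINS = {"www"}


def effective_reach(host: str, domain_page_views: int) -> int:
    """Return the domain's page views only for the domain itself or a
    well-known subdomain of it; return 0 for any other sub-host."""
    labels = host.lower().rstrip(".").split(".")
    if len(labels) == 2:
        # Bare domain, e.g. theguardian.com
        return domain_page_views
    if len(labels) == 3 and labels[0] in WELL_KNOWN_SUBDOMAINS:
        # Well-known third-level subdomain, e.g. www.theguardian.com
        return domain_page_views
    # Other sub-hosts, e.g. foobar.blogspot.com, would otherwise
    # inherit the reach of the whole domain.
    return 0


if __name__ == "__main__":
    for host in ("theguardian.com", "www.theguardian.com", "foobar.blogspot.com"):
        print(host, effective_reach(host, 1_000_000))
```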

Reach is calculated in the following ways:

  • Blogs, news sites, forums: the number of page views
  • Facebook: the number of fans of the page (note: only available for public pages monitored by Talkwalker; we do not collect fan counts for user profiles)
  • Twitter: the number of followers of the author