Talkwalker Search Histogram outside of any project

https://api.talkwalker.com/api/v1/search/histogram/<type>

How it works

With the Talkwalker Search Histogram API, you can retrieve the distribution of the number of search results for a given search query. Histograms can show the distribution over time or over a specific metric (number of comments, number of shares, reach, retweets etc.). Setting min and max limits a histogram to a specific range; min_include and max_include control whether those bounds are included. The range can be a time range for published/search_indexed or an upper and lower cap for a metric such as engagement. interval defines the width of the bins: the accepted values are long integers for metrics, or duration values (such as 7d for 7 days) for the published and search_indexed dates. When the bin size is a whole number of days, timezone sets the timezone that determines where each day begins and ends.

note

The 30-day-limitation on global data also holds for histograms.
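
As a rough illustration of how the endpoint and its query parameters fit together, the following Python sketch assembles a request URL. The helper name build_histogram_url is hypothetical and not part of any Talkwalker client library; it only uses the endpoint path and parameter names described on this page.

```python
from urllib.parse import urlencode

# Endpoint prefix as documented above; the histogram type is appended as a path segment.
BASE_URL = "https://api.talkwalker.com/api/v1/search/histogram"


def build_histogram_url(histogram_type, access_token, queries, **params):
    """Hypothetical helper: assemble a Search Histogram request URL."""
    pairs = [("access_token", access_token)]
    # Multiple queries are sent as repeated q parameters.
    pairs += [("q", query) for query in queries]
    for name, value in params.items():
        if value is None:
            continue  # leave the parameter out so the server-side default applies
        if isinstance(value, bool):
            value = "true" if value else "false"  # the page documents lowercase booleans
        pairs.append((name, value))
    return f"{BASE_URL}/{histogram_type}?{urlencode(pairs)}"


print(build_histogram_url("published", "demo", ["cats"], interval="6h"))
```

urlencode percent-encodes reserved characters, so a timezone such as Asia/Tokyo is sent as Asia%2FTokyo.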

Histogram types

type	Description	Representation
published	Timestamp of publication (epoch time in milliseconds)	Histogram
search_indexed	Timestamp of indexation in Talkwalker (epoch time in milliseconds)	Histogram
reach	The reach of an article/post represents the number of people who were reached by this article/post.	Histogram
engagement	The engagement of an article/post is the sum of actions made by others on that article/post.	Histogram
facebook_shares	Number of Facebook shares an article has	Histogram
facebook_likes	Number of Facebook likes an article has	Histogram
twitter_retweets	Number of Twitter retweets an article has	Histogram
twitter_shares	Number of Twitter shares an article has	Histogram
twitter_likes	Number of Twitter likes an article has	Histogram
twitter_followers	Number of Twitter followers a source has	Histogram
twitter_impressions	Number of Twitter impressions an article has	Histogram
twitter_video_views	Number of Twitter video views an article has	Histogram
instagram_likes	Number of Instagram likes an article has	Histogram
youtube_views	Number of YouTube views a video has	Histogram
youtube_likes	Number of YouTube likes a video has	Histogram
youtube_dislikes	Number of YouTube dislikes a video has	Histogram
comment_count	Number of comments an article has	Histogram
language	Number of documents written in a language	Top-N Distribution
country	Number of documents with a source from a certain country	Top-N Distribution
source_region	Number of documents with a source from a certain region, depends on geolocation resolution	Top-N Distribution
source_city	Number of documents with a source from a certain city, depends on geolocation resolution	Top-N Distribution
gender	Number of documents written by an author of a particular gender	Top-N Distribution
age	Number of documents written by an author in a predefined age group	Distribution
unique_author	Total number of different authors	Distribution
theme_cloud	Percentage of documents containing a particular word or hashtag	Top-N Distribution
trending_top_theme	Percentage of documents containing a particular word or hashtag as compared to previous period	Top-N Distribution
hashtag	Number of documents containing a particular hashtag	Top-N Distribution
mentions	Number of documents mentioning a particular account	Top-N Distribution
trending_mentions	Number of documents mentioning a particular account as compared to previous period	Top-N Distribution
emoji	Number of documents containing a particular emoji code	Top-N Distribution
smart_theme	Percentage of documents within a particular Smart Theme group	Top-N Distribution
interest	Number of documents within a particular interest group	Top-N Distribution
occupation	Number of documents within a particular occupation group	Top-N Distribution
sentiment	Number of documents with a particular sentiment	Distribution

Parameters

parameter	description	required?	allowed values	default value
access_token	a read/write token specified in the API application	required
q	The query to search for	required	Talkwalker query syntax
min	Minimum value for bins	optional	Long Integer value	For `published`: tomorrow - 8 days or max - 8 days
max	Maximum value for bins	optional	Long Integer value	For `published`: tomorrow or min + 8 days
min_include	Include min value	optional	`true` / `false`	true
max_include	Include max value	optional	`true` / `false`	false
interval	Bin Interval	optional	Duration for `published` and `search_indexed` / Integer for histogram / not used for distribution	dynamic
timezone	Timezone	optional	tz database: timezone name (e.g. `Europe/Luxembourg`, `Australia/Perth`)	UTC
breakdown	Nested histogram	optional	`sentiment`, `sourcetype`, `country`	-
value_type	Nested metric for time based histograms	optional	metric histogram types	-
top_n	Size limiter for demographic distribution	optional	Integer value in ]0, 100]	10
time_range	Time range filter	optional	format `number` + a time unit character (e.g. `30d` for 30 days.)	`7d` for trending types, otherwise none
percentage_relation	Specify the relation for theme clouds	optional	`breakdown`, `query`, `total`	`breakdown`
tokenizing_mode	Tokenizing mode for theme cloud histograms	optional	normal, two_grams, three_grams, noun_phrase, verb_phrase	normal
smart_theme	Smart Theme	required for type `smart_theme` only	brands, celebrities, emotions, events	normal

Possible values for interval when creating a histogram over published or search_indexed: year, quarter, month, week, day, hour, minute, second, as well as numeric values with the units w (weeks), d (days), h (hours), m (minutes), and s (seconds), e.g. 5d for 5 days or 2w for 2 weeks.
The maximum number of histogram bins is 400. If the min, max and interval parameters result in a larger number of bins, an error message (HTTP 400) is returned; try reducing the range or increasing the interval.
value_type allows specifying a type for nested statistics per bin in a histogram over published or search_indexed.
Possible time unit characters for time_range are: s for seconds, m for minutes, h for hours, d for days, w for weeks and M for months.
The smart_theme parameter is only required for the smart_theme histogram type.
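
To avoid the HTTP 400 error for too many bins, the bin count can be estimated on the client before sending a request. The sketch below is only an approximation under stated assumptions: it handles fixed-length intervals only (not month, quarter or year), it assumes bins simply tile the range from min to max, and the function names are hypothetical. The server may align bins differently, so the real count can deviate slightly.

```python
import math
import re

MAX_BINS = 400  # documented upper limit on histogram bins

# Fixed-length units only; month, quarter and year vary in length and are not handled here.
UNIT_MS = {"s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000, "w": 604_800_000}
NAMED_MS = {"second": 1_000, "minute": 60_000, "hour": 3_600_000,
            "day": 86_400_000, "week": 604_800_000}


def interval_to_ms(interval):
    """Convert an interval such as 'day', '6h' or '2w' to milliseconds."""
    if interval in NAMED_MS:
        return NAMED_MS[interval]
    match = re.fullmatch(r"([1-9]\d*)([smhdw])", interval)
    if match is None:
        raise ValueError(f"unsupported interval for this sketch: {interval!r}")
    return int(match.group(1)) * UNIT_MS[match.group(2)]


def estimate_bin_count(min_ms, max_ms, interval):
    # Assumption: bins tile the range without timezone alignment effects.
    return math.ceil((max_ms - min_ms) / interval_to_ms(interval))


def exceeds_bin_limit(min_ms, max_ms, interval):
    return estimate_bin_count(min_ms, max_ms, interval) > MAX_BINS


# Four days of data in 10-minute bins would need 576 bins, above the limit.
print(exceeds_bin_limit(1601510400000, 1601856000000, "10m"))
```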

Since some parameters are only used by certain histogram types, the following table provides an overview of all working combinations.

	access_token q timezone time_range	min max min_include max_include interval	forecast_days value_type	breakdown	top_n	percentage_relation tokenizing_mode	smart_theme
published search_indexed	x	x	x	x
engagement reach facebook_shares facebook_likes twitter_shares twitter_retweets twitter_followers twitter_likes twitter_impressions twitter_video_views youtube_likes youtube_dislikes youtube_views instagram_likes cluster_size comment_count	x	x
sentiment unique_author age	x
interest occupation hashtag emoji mentions trending_mentions language country source_region source_city gender	x				x
theme_cloud	x			x	x	x
trending_top_theme	x				x	x
smart_theme	x				x		x

10 credits per call.

Histogram Examples

Get a histogram over the last 8 days of results containing the word "cats" for Australian time

Set the query to cats. For type published, the Talkwalker Search Histogram API by default returns results from 8 days before tomorrow up to tomorrow (see the defaults for min and max), which gives 8 daily bins.

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/published?access_token=demo&timezone=Australia/Perth&q=cats&interval=day&pretty=true'

Response
{
    "status_code": "0",
    "status_message": "OK",
    "request": "GET /api/v1/search/histogram?access_token=demo&q=cats&interval=day",
    "result_histogram": {
        "header": {
            "v": ["Number Results"]
        },
        "data": [
            {
                "t": 1417478400000,
                "v": [4366.0]
            },
            {
                "t": 1417564800000,
                "v": [3385.0]
            },
            {
                "t": 1417651200000,
                "v": [4233.0]
            },
            {
                "t": 1417737600000,
                "v": [4071.0]
            },
            {
                "t": 1417824000000,
                "v": [2571.0]
            },
            {
                "t": 1417910400000,
                "v": [2191.0]
            },
            {
                "t": 1417996800000,
                "v": [3275.0]
            },
            {
                "t": 1418083200000,
                "v": [1140.0]
            }
        ]
    }
}

t indicates the time-based lower bound of the current bucket, while v is the number of elements inside that bucket.
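
A time-based response of this shape can be turned into readable rows with a few lines of Python. The following is a minimal sketch: the function name time_buckets is hypothetical, the sample dictionary copies the first two buckets of the response above, and timestamps are rendered in UTC.

```python
from datetime import datetime, timezone

# First two buckets of the example response above.
sample = {
    "result_histogram": {
        "header": {"v": ["Number Results"]},
        "data": [
            {"t": 1417478400000, "v": [4366.0]},
            {"t": 1417564800000, "v": [3385.0]},
        ],
    }
}


def time_buckets(response):
    """Return (bucket start as UTC datetime, result count) pairs."""
    rows = []
    for bucket in response["result_histogram"]["data"]:
        # t is the lower bound of the bucket in epoch milliseconds.
        start = datetime.fromtimestamp(bucket["t"] / 1000, tz=timezone.utc)
        rows.append((start, bucket["v"][0]))
    return rows


for start, count in time_buckets(sample):
    print(start.isoformat(), count)
```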

Get a histogram with a resolution of 6 hours over the last 7 days of results containing the word "cats"

Set interval to 6h for 4 values per day.

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/published?access_token=demo&q=cats&interval=6h'

The interval parameter accepts the values year, quarter, month, week, day, hour, minute, second as well as numeric values with the units w (week), d (day), h (hours), m (minutes), and s (seconds).

Get a histogram over a specific time window

important

Due to the 30-day-limitation on global data, please replace the timestamps in the following example by recent values.

Set min to 1601510400000 and max to 1601856000000 to get a histogram of results published between 01.10.2020 and 05.10.2020 with start timestamp included and end timestamp excluded (default values).

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/published?access_token=demo&q=cats&min=1601510400000&max=1601856000000'

Response
{
  "status_code": "0",
  "status_message": "OK",
  "request": "GET /api/v1/search/histogram/published?access_token=demo&q=cats&min=1601510400000&max=1601856000000",
  "result_histogram": {
    "header": {
      "v": [
        "Number Results"
      ]
    },
    "data": [
      {
        "t": 1601510400000,
        "v": [
          19123.0
        ]
      },
      {
        "t": 1601596800000,
        "v": [
          18855.0
        ]
      },
      {
        "t": 1601683200000,
        "v": [
          20678.0
        ]
      },
      {
        "t": 1601769600000,
        "v": [
          14820.0
        ]
      }
    ]
  }
}

The min and max parameters accept timestamps in epoch format (milliseconds after 1.1.1970 UTC).
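
Values for min and max can be derived from calendar dates. The sketch below shows one way to do this in Python; to_epoch_ms is a hypothetical helper name, and it insists on timezone-aware datetimes so that the result does not depend on the local settings of the machine.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment):
    """Convert a timezone-aware datetime to epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


min_ms = to_epoch_ms(datetime(2020, 10, 1, tzinfo=timezone.utc))
max_ms = to_epoch_ms(datetime(2020, 10, 5, tzinfo=timezone.utc))
print(min_ms, max_ms)
```

These two values correspond to the min and max used in the command above.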

Special attention needs to be paid when working with the timezone parameter. In the above example, we get one result value for each started day in the respective timezone, amounting to a total of 4 values.

We repeat the call from before, but this time, we set timezone to Asia/Tokyo (UTC+9).

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/published?access_token=demo&q=cats&min=1601510400000&max=1601856000000&timezone=Asia%2FTokyo'

Response
{
    "status_code": "0",
    "status_message": "OK",
    "request": "GET /api/v1/search/histogram/published?access_token=demo&q=cats&min=1601510400000&max=1601856000000&timezone=Asia%2FTokyo",
    "result_histogram": {
        "header": {
            "v": ["Number Results"]
        },
        "data": [
            {
                "t": 1601478000000,
                "v": [10329.0]
            },
            {
                "t": 1601564400000,
                "v": [19244.0]
            },
            {
                "t": 1601650800000,
                "v": [22390.0]
            },
            {
                "t": 1601737200000,
                "v": [15045.0]
            },
            {
                "t": 1601823600000,
                "v": [6468.0]
            }
        ]
    }
}

This time, we get 5 result values. This is due to min and max, which resolve to different times. In the first example, min and max resolve to 01.10.2020 00:00:00 UTC and 05.10.2020 00:00:00 UTC respectively. In this second example, they resolve to 01.10.2020 09:00:00 JST and 05.10.2020 09:00:00 JST respectively.

In other words, we get 15 instead of 24 hours worth of data for 01.10.2020 and we get 9 hours worth of data for 05.10.2020. When changing timezone, min and max need to be adjusted accordingly.
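
The effect can be reproduced offline. The following sketch computes how many hours of the requested range fall into each local calendar day. It is an illustration only: hours_per_local_day is a hypothetical name, and a fixed UTC offset is assumed instead of a named timezone, so daylight saving changes are ignored.

```python
from datetime import datetime, time, timedelta, timezone


def hours_per_local_day(min_ms, max_ms, utc_offset_hours):
    """Map each local calendar day to the hours of [min, max) that fall into it."""
    tz = timezone(timedelta(hours=utc_offset_hours))  # assumption: fixed offset, no DST
    cursor = datetime.fromtimestamp(min_ms / 1000, tz)
    end = datetime.fromtimestamp(max_ms / 1000, tz)
    coverage = {}
    while cursor < end:
        next_midnight = datetime.combine(
            cursor.date() + timedelta(days=1), time.min, tzinfo=tz
        )
        segment_end = min(next_midnight, end)
        hours = (segment_end - cursor).total_seconds() / 3600
        coverage[cursor.date().isoformat()] = hours
        cursor = segment_end
    return coverage


# Same min and max as above, once for UTC and once for UTC+9 (Asia/Tokyo).
print(hours_per_local_day(1601510400000, 1601856000000, 0))
print(hours_per_local_day(1601510400000, 1601856000000, 9))
```

With an offset of 0 the range covers four full days; with an offset of 9 it touches five local days, the first and last only partially.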

Get a histogram and statistics over engagement

For types different from published and search_indexed, the histogram API also returns statistics (average, minimum, maximum and sum) over every bin.

(For published and search_indexed, we can specify an additional metric for statistics with the value_type parameter.) We can use the min and max parameters to only consider documents whose engagement lies within a specific range.

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/engagement?access_token=demo&q=cats&min=1000&max=2000'

Response
{
  "status_code": "0",
  "status_message": "OK",
  "request": "GET /api/v1/search/histogram/engagement?access_token=demo&q=cats&min=1000&max=2000",
  "result_histogram": {
    "data": [
      {
        "v": [
          33.0
        ],
        "k": 1000.0,
        "val": [
          {
            "count": 33,
            "min": 1002.0,
            "max": 1097.0,
            "avg": 1051.121212121212,
            "sum": 34687.0
          }
        ]
      }, (...)  {
        "v": [
          14.0
        ],
        "k": 1900.0,
        "val": [
          {
            "count": 14,
            "min": 1906.0,
            "max": 1992.0,
            "avg": 1944.9285714285713,
            "sum": 27229.0
          }
        ]
      }
    ]
  }
}

k indicates the number-based lower bound of the current bucket, while v is the number of elements inside that bucket. The content of val contains additional information about the elements of that bucket.
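
As a small illustration of how k, v and val relate, the sketch below reads the per-bucket statistics and recomputes the average from sum and count. The function name bucket_stats is hypothetical, the sample copies the two buckets shown above, and the code assumes a single query, so only the first entry of val is read.

```python
# The two buckets shown in the example response above.
sample = {
    "result_histogram": {
        "data": [
            {"v": [33.0], "k": 1000.0,
             "val": [{"count": 33, "min": 1002.0, "max": 1097.0,
                      "avg": 1051.121212121212, "sum": 34687.0}]},
            {"v": [14.0], "k": 1900.0,
             "val": [{"count": 14, "min": 1906.0, "max": 1992.0,
                      "avg": 1944.9285714285713, "sum": 27229.0}]},
        ]
    }
}


def bucket_stats(response):
    """Return one summary row per bucket of a metric histogram."""
    rows = []
    for bucket in response["result_histogram"]["data"]:
        stats = bucket["val"][0]  # assumption: single query, one stats object per bucket
        count = stats["count"]
        rows.append({
            "lower_bound": bucket["k"],
            "count": count,
            # Recomputed here to show how avg follows from sum and count.
            "mean": stats["sum"] / count if count else None,
        })
    return rows


for row in bucket_stats(sample):
    print(row)
```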

Get a histogram with a breakdown over sentiment

For time-based histograms (type: published or search_indexed), it is possible to add a breakdown parameter, set to either sentiment, sourcetype or country. The header then contains the different values for the chosen breakdown type, while the data field v lists in the same order the number of matching elements from a bucket.

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/published?access_token=demo&q=cats&breakdown=sentiment'

Response
{
  "status_code" : "0",
  "status_message" : "OK",
  "request" : "GET /api/v1/search/histogram/published?access_token=demo&q=cats&breakdown=sentiment&pretty=true",
  "result_histogram" : {
    "header" : {
      "v" : [ "POSITIVE", "NEUTRAL", "NEGATIVE" ]
    },
    "data" : [ {
      "t" : 1577923200000,
      "v" : [ 3944.0, 10488.0, 1732.0 ]
    } ...
    // truncated
    {
      "t" : 1578528000000,
      "v" : [ 922.0, 2814.0, 573.0 ]
    } ]
  }
}
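
Because the header and each v array share the same order, they can be paired up directly. A minimal Python sketch, with the hypothetical function name breakdown_rows and a sample built from the first bucket of the response above:

```python
# First bucket of the example response above.
sample = {
    "result_histogram": {
        "header": {"v": ["POSITIVE", "NEUTRAL", "NEGATIVE"]},
        "data": [
            {"t": 1577923200000, "v": [3944.0, 10488.0, 1732.0]},
        ],
    }
}


def breakdown_rows(response):
    """Pair each value in v with the header label at the same position."""
    histogram = response["result_histogram"]
    labels = histogram["header"]["v"]
    return [
        {"t": bucket["t"], **dict(zip(labels, bucket["v"]))}
        for bucket in histogram["data"]
    ]


print(breakdown_rows(sample))
```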

Top N distribution examples

While it is possible to specify N by using the top_n parameter, a default value of 10 is set. The output for Top N distributions is very similar to the histogram output; the main differences are:

The total number of hits is returned in the total_hits field
The key is stored in the ks field (or in a key object for source_region and source_city)

Get the Top 3 languages

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/language?access_token=demo&q=cats&top_n=3&pretty=true'

Response
{
    "status_code": "0",
    "status_message": "OK",
    "request": "GET /api/v1/search/histogram/language?access_token=demo&q=cats&top_n=3&pretty=true",
    "result_histogram": {
        "data": [
            {
                "v": [550242.0],
                "ks": "en"
            },
            {
                "v": [6918.0],
                "ks": "de"
            },
            {
                "v": [1882.0],
                "ks": "ja"
            },
            {
                "v": [1711.0],
                "ks": "fr"
            }
        ],
        "total_hits": 570382
    },
    "request_id": "#qx0h3vup7r8d#"
}

ks serves as a string-based indicator of a bucket, while v represents the number of elements inside. The total number of results is provided in the total_hits field.
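
The counts can be related to total_hits to obtain each bucket's share of all results. The sketch below is illustrative only; top_n_shares is a hypothetical name and the sample copies the values from the response above.

```python
# Values copied from the example response above.
sample = {
    "result_histogram": {
        "data": [
            {"v": [550242.0], "ks": "en"},
            {"v": [6918.0], "ks": "de"},
            {"v": [1882.0], "ks": "ja"},
            {"v": [1711.0], "ks": "fr"},
        ],
        "total_hits": 570382,
    }
}


def top_n_shares(response):
    """Return (key, count, share of total_hits) for every bucket."""
    histogram = response["result_histogram"]
    total = histogram["total_hits"]
    return [
        (bucket["ks"], bucket["v"][0], bucket["v"][0] / total)
        for bucket in histogram["data"]
    ]


for key, count, share in top_n_shares(sample):
    print(f"{key}: {count:.0f} results ({share:.1%})")
```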

Get the Top 3 themes

The top N themes differ from other Top N distributions in that they are calculated from a sample of the matching documents. The result therefore does not contain the number of sampled documents that include a certain token, but the percentage of sampled documents (given as a fraction, e.g. 0.996 in the example below).

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/theme_cloud?access_token=demo&q=cats&top_n=3&pretty=true'

Response
{
    "status_code": "0",
    "status_message": "OK",
    "request": "GET /api/v1/search/histogram/theme_cloud?access_token=demo&q=cats&top_n=3&pretty=true",
    "result_histogram": {
        "data": [
            {
                "v": [0.996],
                "ks": "cats"
            },
            {
                "v": [0.265],
                "ks": "cat"
            },
            {
                "v": [0.193],
                "ks": "dogs"
            }
        ],
        "total_hits": 570399
    },
    "request_id": "#qx0h7rc0i29n#"
}

ks serves as a string-based indicator of a bucket, while v represents the share of sampled results that contain it. The total number of results is provided in the total_hits field.

Get the Top 3 emoji

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/emoji?access_token=demo&q=cats&top_n=3&pretty=true'

Response
{
    "status_code": "0",
    "status_message": "OK",
    "request": "GET /api/v1/search/histogram/emoji?access_token=demo&q=cats&top_n=3&pretty=true",
    "result_histogram": {
        "data": [
            {
                "v": [5289.0],
                "ks": "❤"
            },
            {
                "v": [3349.0],
                "ks": "😂"
            },
            {
                "v": [2428.0],
                "ks": "🐱"
            }
        ],
        "total_hits": 570083
    },
    "request_id": "#qx0br7wh57pp#"
}

ks serves as a string-based indicator of a bucket (Java source code encoded), while v represents the number of elements inside. The total number of results is provided in the total_hits field.

Get the Top 3 regions

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/source_region?access_token=demo&q=cats&top_n=3&pretty=true'

Response
{
    "status_code": "0",
    "status_message": "OK",
    "request": "GET /api/v1/search/histogram/source_region?access_token=demo&q=cats&top_n=3&pretty=true",
    "result_histogram": {
        "data": [
            {
                "v": [2006.0],
                "key": {
                    "country_code": "us",
                    "region": "California",
                    "short_id": "california_us"
                }
            },
            {
                "v": [1599.0],
                "key": {
                    "country_code": "ph",
                    "region": "Isabela (province)",
                    "short_id": "isabela_ph"
                }
            },
            {
                "v": [786.0],
                "key": {
                    "country_code": "us",
                    "region": "New York",
                    "short_id": "newyork_us"
                }
            }
        ],
        "total_hits": 17914
    },
    "request_id": "#qyc8lsx8lof6#"
}

region serves as a string-based indicator of a bucket, while v represents the number of elements inside. The total number of results is provided in the total_hits field.

Get the Top 3 cities

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/source_city?access_token=demo&q=cats&top_n=3&pretty=true'

Response
{
    "status_code": "0",
    "status_message": "OK",
    "request": "GET /api/v1/search/histogram/source_city?access_token=demo&q=cats&top_n=3&pretty=true",
    "result_histogram": {
        "data": [
            {
                "v": [1598.0],
                "key": {
                    "country_code": "ph",
                    "region": "Isabela (province)",
                    "city": "Jones, Isabela",
                    "short_id": "jones_isabela_ph"
                }
            },
            {
                "v": [948.0],
                "key": {
                    "country_code": "us",
                    "region": "California",
                    "city": "San Francisco",
                    "short_id": "sanfrancisco_california_us"
                }
            },
            {
                "v": [737.0],
                "key": {
                    "country_code": "us",
                    "region": "South Dakota",
                    "city": "Sioux Falls, South Dakota",
                    "short_id": "siouxfalls_southdakota_us"
                }
            }
        ],
        "total_hits": 12971
    },
    "request_id": "#qyc8uo6qqxiz#"
}

city serves as a string-based indicator of a bucket, while v represents the number of elements inside. The total number of results is provided in the total_hits field.
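
Since some Top N types identify a bucket with a plain ks string and others with a key object, client code may want a single way to obtain a display label. The following sketch shows one possible approach; bucket_label is a hypothetical helper and the fallback order (city, then region, then short_id) is a design choice of this example, not something the API prescribes.

```python
def bucket_label(bucket):
    """Return a display label for a Top N bucket, whatever its key style."""
    if "ks" in bucket:
        return bucket["ks"]
    key = bucket["key"]
    # Prefer the most specific name that is present.
    return key.get("city") or key.get("region") or key["short_id"]


# Buckets copied from the language, region and city examples above.
language_bucket = {"v": [550242.0], "ks": "en"}
region_bucket = {"v": [2006.0], "key": {"country_code": "us", "region": "California",
                                        "short_id": "california_us"}}
city_bucket = {"v": [948.0], "key": {"country_code": "us", "region": "California",
                                     "city": "San Francisco",
                                     "short_id": "sanfrancisco_california_us"}}

for bucket in (language_bucket, region_bucket, city_bucket):
    print(bucket_label(bucket), bucket["v"][0])
```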

Distribution examples

Unlike the Top N distribution, no additional parameter (such as top_n) is required. Other than that, both share the same structure:

The total number of hits is returned in the total_hits field
The key is stored in the ks field

Get the sentiment distribution

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/sentiment?access_token=demo&q=cats&pretty=true'

Response
{
    "status_code": "0",
    "status_message": "OK",
    "request": "GET /api/v1/search/histogram/sentiment?access_token=demo&q=cats&pretty=true",
    "result_histogram": {
        "data": [
            {
                "v": [327698.0],
                "ks": "NEUTRAL"
            },
            {
                "v": [149214.0],
                "ks": "POSITIVE"
            },
            {
                "v": [93485.0],
                "ks": "NEGATIVE"
            },
            {
                "v": [0.0],
                "ks": "NONE"
            }
        ],
        "total_hits": 570397
    },
    "request_id": "#qx0hczxspbq8#"
}

Get the number of unique authors

While the number of unique authors is a single value, it shares the same structure as distributions, which is why it is categorized as one.

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/unique_author?access_token=demo&q=cats&pretty=true'

Response
{
    "status_code": "0",
    "status_message": "OK",
    "request": "GET /api/v1/search/histogram/unique_author?access_token=demo&q=cats&pretty=true",
    "result_histogram": {
        "data": [
            {
                "v": [306021.0],
                "ks": "value"
            }
        ],
        "total_hits": 570392
    },
    "request_id": "#qx0hf66t2vwz#"
}

Rate Limit

This endpoint is limited to 60 calls per minute.
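
A client can stay below this limit by spacing its calls. The sketch below is one conservative way to do so and rests on assumptions: the Throttle class is hypothetical, and it enforces an even gap of one second between calls, which is stricter than the limit may require because this page does not specify how the server measures the one-minute window.

```python
import time


class Throttle:
    """Block so that consecutive calls are at least 60 / calls_per_minute seconds apart."""

    def __init__(self, calls_per_minute=60, clock=time.monotonic, sleep=time.sleep):
        self.min_gap = 60.0 / calls_per_minute
        self.clock = clock  # injectable for testing
        self.sleep = sleep
        self.last_call = None

    def wait(self):
        now = self.clock()
        if self.last_call is not None:
            remaining = self.min_gap - (now - self.last_call)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_call = now


throttle = Throttle()
throttle.wait()  # returns immediately on the first call
# ... send the first request here ...
throttle.wait()  # sleeps until one second has passed since the previous call
```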

Multiple query examples

It is possible to enter multiple queries when working with histograms. All histogram types are compatible with multiple queries. The result structure is similar to that of a histogram with the breakdown parameter: query contains all entered queries, while v and, in some cases, val contain the respective results in the same order.

Simple example

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/published?access_token=demo&q=cats&q=dogs'

Response
{
  "status_code" : "0",
  "status_message" : "OK",
  "request" : "GET /api/v1/search/histogram/published?access_token=demo&q=cats&q=dogs&pretty=true",
  "result_histogram" : {
    "header" : {
      "v" : [ "Number Results" ]
    },
    "data" : [ {
      "t" : 1577923200000,
      "v" : [ 123.0, 456.0 ]
    } ...
    // truncated
    {
      "t" : 1578528000000,
      "v" : [ 789.0, 1011.0 ]
    } ],
    "query" : [ "cats", "dogs" ]
  }
}

Top N example

Top N histogram types like language can also be used with multiple queries. The result contains all top N values for the different queries, which may be more than N in total, as shown in the example below.

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/language?access_token=demo&q=dogs&q=cats&top_n=3'

Response
{
  "status_code" : "0",
  "status_message" : "OK",
  "request" : "GET /api/v1/search/histogram/language?access_token=demo&q=dogs&q=cats&pretty=true",
  "result_histogram" : {
    "data" : [ {
      "v" : [ 377.0, 483.0 ],
      "ks" : "en"
    }, {
      "v" : [ 215.0, 251.0 ],
      "ks" : "ja"
    }, {
      "v" : [ -1.0, 123.0 ],
      "ks" : "ar"
    }, {
      "v" : [ 87.0, -1.0 ],
      "ks" : "es"
    } ],
    "total_hits" : 1700,
    "total_query_hits" : [ 721, 1234 ],
    "query" : [ "dogs", "cats" ]
  }
}

Searching for the top 3 languages for each query, the third place is different for both, so the result contains 4 keys in total. A -1.0 marker is placed where one query does not have a value belonging to the top N. The results are sorted by the sum of the values, not taking the -1.0 markers into consideration.
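
The -1.0 markers need to be filtered out before the values are used. The sketch below splits such a result into one ranking per query. It assumes that each bucket carries v as a list with one entry per query, in the order given by query; the function name per_query_top_n is hypothetical, and the sample mirrors the numbers of the example above.

```python
NOT_IN_TOP_N = -1.0  # marker described above

# Assumed structure: v holds one value per query, in the order of "query".
histogram = {
    "data": [
        {"v": [377.0, 483.0], "ks": "en"},
        {"v": [215.0, 251.0], "ks": "ja"},
        {"v": [-1.0, 123.0], "ks": "ar"},
        {"v": [87.0, -1.0], "ks": "es"},
    ],
    "query": ["dogs", "cats"],
}


def per_query_top_n(result_histogram):
    """Return {query: {key: value}}, dropping keys outside a query's top N."""
    queries = result_histogram["query"]
    ranking = {query: {} for query in queries}
    for bucket in result_histogram["data"]:
        for query, value in zip(queries, bucket["v"]):
            if value != NOT_IN_TOP_N:
                ranking[query][bucket["ks"]] = value
    return ranking


print(per_query_top_n(histogram))
```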

Multiple queries + breakdown example

Command
curl 'https://api.talkwalker.com/api/v1/search/histogram/published?access_token=demo&q=dogs&q=cats&breakdown=sentiment'

Response
{
  "status_code" : "0",
  "status_message" : "OK",
  "request" : "GET /api/v1/search/histogram/published?access_token=demo&q=dogs&q=cats&pretty=true",
  "result_histogram" : {
    "header" : {
      "v" : [ "POSITIVE", "NEUTRAL", "NEGATIVE" ]
    },
    "data" : [ {
      "t" : 1577923200000,
      "v" : [ 41.0, 38.0, 44.0, 152.0, 140.0, 164.0 ]
    } ...
    // truncated
    {
      "t" : 1578528000000,
      "v" : [ 200.0, 250.0, 339.0, 333.0, 350.0, 328.0 ]
    } ],
    "query" : [ "dogs", "cats" ]
  }
}

The content of v is ordered by query first and breakdown second. For two queries q1, q2 and a breakdown over sentiment, this results in the following ordering: [ <q1, POSITIVE>, <q1, NEUTRAL>, <q1, NEGATIVE>, <q2, POSITIVE>, <q2, NEUTRAL>, <q2, NEGATIVE> ]

As an illustration, compare the values for t = 1577923200000 in the table below with the example above.

	Dogs	Cats
Positive	41.0	152.0
Neutral	38.0	140.0
Negative	44.0	164.0
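
The same regrouping can be done in code. The following sketch slices the flat v array into one block per query, following the ordering described above; split_by_query is a hypothetical name and the input values are those of the bucket with t = 1577923200000.

```python
def split_by_query(labels, queries, values):
    """Regroup a flat v array (query first, breakdown second) into nested dicts."""
    width = len(labels)
    if len(values) != width * len(queries):
        raise ValueError("v does not match the number of queries and breakdown labels")
    return {
        query: dict(zip(labels, values[index * width:(index + 1) * width]))
        for index, query in enumerate(queries)
    }


labels = ["POSITIVE", "NEUTRAL", "NEGATIVE"]
queries = ["dogs", "cats"]
values = [41.0, 38.0, 44.0, 152.0, 140.0, 164.0]

print(split_by_query(labels, queries, values))
```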
