Twingly Blog Search API

Deprecation Notice

Blog Search API v2 and earlier versions have been deprecated and should not be used. Please use the latest version of Blog Search API.

Introduction

Twingly Blog Search API (previously known as Analytics API) is a commercial XML-over-HTTP API that enables machine access to Twingly’s blog search index. Currently, the last 12 months of data are searchable through the API. Retrieving data requires an API key issued by Twingly. The key grants access to blog data for one or more languages.

Note: Twingly Blog Search API was previously known as Analytics API. Not all references to the old name, Twingly Analytics API, have been updated yet.

API Endpoint

A GET request to /Analytics.ashx retrieves blog posts which match the specified query. The blog posts are by default returned in date order starting with the newest. At most 1000 hits are returned for a given query.

HTTP options

Request Parameters

Parameter values must be URL encoded.

Parameter Notes
key Required. The API key provided by Twingly.
searchpattern Required. The search query, see our search language documentation.
documentlang Restricts the query to a specific language, for example en or sv; see the list of all supported languages. If omitted, posts in all languages are returned. Note that some API keys only grant access to specific languages.
ts Only include posts published after the given UTC timestamp. Example: 2014-08-18T00:00:00Z. This is the same as using start-date in the search pattern. Note that ts has lower precedence than start-date.
tsTo Only include posts published before the given UTC timestamp. This is the same as using end-date in the search pattern. Note that tsTo has lower precedence than end-date.
xmloutputversion 0 - the original (deprecated) format, 1 - new format without blogrank, 2 - new format with blogrank.

The new format includes the total number of matches, the index time of each post and the tags associated with it, while the original format does not. The default is 0, but we recommend using the newer format.
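
As a rough illustration of how the request parameters fit together, the following Python sketch builds a request URL. The host and path are taken from the example request on this page; the function name build_search_url and its keyword arguments are hypothetical and not part of any official client. URL encoding is delegated to the standard library.

```python
from urllib.parse import urlencode

# Host and path as used in the example request in this document.
ENDPOINT = "https://api.twingly.com/analytics/Analytics.ashx"


def build_search_url(key, searchpattern, documentlang=None,
                     ts=None, ts_to=None, xmloutputversion=2):
    """Build a GET URL for the search endpoint (hypothetical helper).

    Only key and searchpattern are required; optional parameters are
    left out of the query string when they are not given.
    """
    params = {
        "key": key,
        "searchpattern": searchpattern,
        "xmloutputversion": xmloutputversion,
    }
    if documentlang is not None:
        params["documentlang"] = documentlang
    if ts is not None:
        params["ts"] = ts
    if ts_to is not None:
        params["tsTo"] = ts_to
    # urlencode percent-encodes every value, as the API requires.
    return ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("KEY", "twingly", documentlang="sv",
                           ts="2014-08-18T00:00:00Z"))
```

The sketch defaults xmloutputversion to 2 because the newer format is the recommended one; pass 0 or 1 to request one of the other formats.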

Deprecated Parameters

The following parameters are deprecated and should not be used anymore as they may return unexpected results.

Parameter Notes
approved Limits the search result to approved (spam-free) blogs only. The approval system was discontinued in 2012, so the data is stale.
fields Intended to restrict the search to only the title or the summary. This parameter has no effect.
product Can be either microblogsearch or search. No API keys for microblogsearch are issued, as we have stopped indexing microblogs. The product parameter should be omitted.

Example searchpattern

This searches for all posts tagged “fashion” from the blog blogg.veckorevyn.com/fridagrahn. The results are returned in ascending order by publish date.

tag:fashion blog:blogg.veckorevyn.com/fridagrahn sort:published sort-order:asc

See our search language documentation for definitions.

Example Request

With searchpattern: twingly

curl -s "https://api.twingly.com/analytics/Analytics.ashx?key=KEY&searchpattern=twingly&ts=2014-08-27+10%3a00%3a00&tsTo=2014-08-27+11%3a00%3a00&documentlang=sv&xmloutputversion=2"

Response

The response is XML. Its structure varies with the xmloutputversion parameter; only the new format with blogrank is presented here.

<?xml version="1.0" encoding="UTF-8"?>
<twinglydata numberOfMatchesReturned="1000" secondsElapsed="0.148" numberOfMatchesTotal="19017">
    <post contentType="blog">
        <url>http://oppogner.blogg.no/1409602010_bare_m_ha.html</url>
        <title><![CDATA[ Bare MÅ ha! ]]></title>
        <summary><![CDATA[Ja, velkommen til høsten ...]]></summary>
        <languageCode>no</languageCode>
        <published>2014-09-02 06:53:26Z</published>
        <indexed>2014-09-02 09:00:53Z</indexed>
        <blogUrl>http://oppogner.blogg.no/</blogUrl>
        <blogName><![CDATA[ oppogner ]]></blogName>
        <authority>1</authority>
        <blogRank>1</blogRank>
        <tags>
            <tag><![CDATA[ Blogg ]]></tag>
        </tags>
    </post>
    ...
</twinglydata>
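
A response in this format could be read with Python's standard XML parser along the following lines. This is a minimal sketch: the function name parse_response and the shape of the returned dictionary are illustrative choices, and only a few of the elements shown above are extracted.

```python
import xml.etree.ElementTree as ET


def parse_response(xml_data):
    """Parse a response in the new format (hypothetical helper).

    Accepts the raw response body (bytes or str) and returns the match
    counts together with a list of simplified post dictionaries.
    """
    root = ET.fromstring(xml_data)
    posts = []
    for post in root.findall("post"):
        posts.append({
            "url": post.findtext("url"),
            # CDATA sections in the sample carry padding spaces; strip them.
            "title": (post.findtext("title") or "").strip(),
            "published": post.findtext("published"),
            "tags": [(tag.text or "").strip()
                     for tag in post.findall("tags/tag")],
        })
    return {
        "returned": int(root.get("numberOfMatchesReturned")),
        "total": int(root.get("numberOfMatchesTotal")),
        "posts": posts,
    }
```

Comparing the "total" and "returned" values tells a caller whether more posts match the query than were delivered in this response.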

Breakdown of the elements and attributes:

<twinglydata numberOfMatchesReturned="1000" secondsElapsed="0.148" numberOfMatchesTotal="19017">
<post contentType="blog">
<url>http://oppogner.blogg.no/1409602010_bare_m_ha.html</url>
<title><![CDATA[ Bare MÅ ha! ]]></title>
<summary><![CDATA[Ja, velkommen til høsten ...]]></summary>
<languageCode>no</languageCode>
<published>2014-09-02 06:53:26Z</published>
<indexed>2014-09-02 09:00:53Z</indexed>
<blogUrl>http://oppogner.blogg.no/</blogUrl>
<blogName><![CDATA[ oppogner ]]></blogName>
<authority>1</authority>
<blogRank>1</blogRank>
<tags>
    <tag><![CDATA[ Blogg ]]></tag>
</tags>

Caching

Search responses are cached for 5 minutes, meaning that you will need to wait at least 5 minutes to get fresh results for a given search query. The cache key is the digest of the search pattern and the parameters. If you need to circumvent the cache, you can, for instance, change the timestamps slightly.
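
One way to change a timestamp slightly is to shift it by a second before repeating the request. The sketch below assumes timestamps in the 2014-08-18T00:00:00Z form shown for the ts parameter; the helper name nudge_timestamp is hypothetical. Keep in mind that shifting ts or tsTo also moves the edge of the time window, so a post published within the shifted second may enter or leave the result.

```python
from datetime import datetime, timedelta, timezone

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def nudge_timestamp(ts, seconds=1):
    """Return ts shifted by the given number of seconds (hypothetical helper).

    A shifted value changes the request parameters, which should lead to a
    different cache key than the one used for the original request.
    """
    parsed = datetime.strptime(ts, TS_FORMAT).replace(tzinfo=timezone.utc)
    return (parsed + timedelta(seconds=seconds)).strftime(TS_FORMAT)
```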


Pagination

If numberOfMatchesTotal is greater than numberOfMatchesReturned, then you will need to paginate through the result in order to retrieve all posts. The best way to do this is to utilize the ts and tsTo parameters, creating a sliding time-based window.

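Set ts to the published time of the newest returned post and repeat your query. This works when the results are sorted in ascending order by publish date (sort:published sort-order:asc), so that each response ends with the newest post fetched so far. Repeat until numberOfMatchesTotal equals numberOfMatchesReturned. An example of this technique can be seen in the examples for the Search API Ruby client.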
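
The sliding window can be sketched in Python as follows. This is not the Ruby client's implementation, only an illustration under stated assumptions: the search pattern sorts results oldest first (sort:published sort-order:asc), published values compare correctly as strings, and fetch_page is a hypothetical callable supplied by the caller that performs one request and returns the posts together with numberOfMatchesReturned and numberOfMatchesTotal. Posts are de-duplicated by URL because it is not stated here whether the ts boundary is inclusive.

```python
def fetch_all(fetch_page, ts, ts_to):
    """Collect all posts in [ts, ts_to] with a sliding time window.

    fetch_page(ts, ts_to) is a caller-supplied (hypothetical) function that
    returns (posts, returned, total), where posts is a list of dicts with
    "url" and "published" keys, sorted oldest first.
    """
    seen = {}
    while True:
        posts, returned, total = fetch_page(ts, ts_to)
        for post in posts:
            # Overlapping windows may return the same post twice.
            seen.setdefault(post["url"], post)
        if total <= returned or not posts:
            break  # everything in the current window has been delivered
        newest = max(post["published"] for post in posts)
        if newest == ts:
            # The window cannot advance: every returned post shares the
            # same publish time. A larger page size is needed here.
            break
        ts = newest
    return list(seen.values())
```

Passing the request function in as an argument keeps the loop independent of any particular HTTP library and makes it easy to exercise with canned data.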

In the (unusual) case where more than 1,000 matching posts (the default page size) are published at the same time, the time window cannot advance past them. You can then increase numberOfMatchesReturned by adding the page-size option to your search pattern.

Example search pattern with page-size:

"christmas page-size:5000"

If you need to keep the API response small, it’s possible to paginate through up to 10,000 matches, with the page option added to your search pattern.

Example search pattern with page and page-size:

"christmas page:2 page-size:100"

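Search patterns for successive pages can be generated programmatically. The sketch below is an illustration only: the helper name paged_patterns is hypothetical, and it assumes that page numbering starts at 1, which this document does not state explicitly. The 10,000-match ceiling is the limit mentioned above.

```python
def paged_patterns(query, total, page_size=100, max_matches=10000):
    """Return one search pattern per page (hypothetical helper).

    total is the numberOfMatchesTotal reported by a first request.
    Pages beyond max_matches are not generated, since only that many
    matches can be reached with the page option.
    """
    reachable = min(total, max_matches)
    pages = -(-reachable // page_size)  # ceiling division
    return [
        "%s page:%d page-size:%d" % (query, number, page_size)
        for number in range(1, pages + 1)
    ]


if __name__ == "__main__":
    for pattern in paged_patterns("christmas", total=250):
        print(pattern)
```

Each generated pattern is then sent as the searchpattern parameter of a separate request.
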