2.2. API Reference for Sites

We provide a basic API for sites. We use the term ‘sites’ to refer to search engines that dedicate a (small) part of their traffic to evaluating runs from participants in TREC OpenSearch. Sites can use this API to update the query set and the documents, and to retrieve rankings. For each retrieved ranking, the site is expected to provide feedback. Everything is implemented as HTTP requests, and we use the request types GET, HEAD, PUT and DELETE. We try to return appropriate 4XX status codes where possible.

For all operations, an API key is required. This key is supplied as the username via HTTP basic authentication; the password should be left empty. Our API is located at http://api.trec-open-search.org/api/v2.
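As a rough illustration of the authentication scheme described above, the following sketch builds (but does not send) an authenticated request using only the Python standard library. The helper names `basic_auth_header` and `build_request` and the key `MY-API-KEY` are placeholders invented for this example, not part of the API.

```python
import base64
import urllib.request

API_BASE = "http://api.trec-open-search.org/api/v2"


def basic_auth_header(api_key: str) -> str:
    """Build the Authorization value: API key as username, empty password."""
    token = base64.b64encode(f"{api_key}:".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def build_request(api_key: str, path: str, method: str = "GET") -> urllib.request.Request:
    """Prepare (without sending) an authenticated request for an API path."""
    request = urllib.request.Request(API_BASE + path, method=method)
    request.add_header("Authorization", basic_auth_header(api_key))
    return request


# "MY-API-KEY" is a placeholder, not a real key.
req = build_request("MY-API-KEY", "/site/query")
print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual call.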

Note

We have rate limited the API to 300 calls per minute or 10 calls per second, whichever limit is reached first. Please let us know if this causes you any problems.
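A site that wants to stay under these limits could throttle its own calls. The sketch below is one possible client-side approach using sliding windows; the class names and the injectable `clock` and `sleep` arguments are choices made for this example, and the server may count calls differently.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls within any `window` seconds."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.stamps = deque()

    def wait_time(self, now):
        """Seconds to wait before another call fits in the window."""
        while self.stamps and now - self.stamps[0] >= self.window:
            self.stamps.popleft()
        if len(self.stamps) < self.limit:
            return 0.0
        return self.window - (now - self.stamps[0])

    def record(self, now):
        self.stamps.append(now)


class ApiThrottle:
    """Combine the per-second and per-minute limits from the note above."""

    def __init__(self, clock=time.monotonic, sleep=time.sleep):
        self.limiters = [SlidingWindowLimiter(10, 1.0),
                         SlidingWindowLimiter(300, 60.0)]
        self.clock = clock
        self.sleep = sleep

    def acquire(self):
        """Block until a call is allowed under both limits, then record it."""
        while True:
            now = self.clock()
            delay = max(limiter.wait_time(now) for limiter in self.limiters)
            if delay <= 0:
                break
            self.sleep(delay)
        for limiter in self.limiters:
            limiter.record(now)
```

Calling `ApiThrottle().acquire()` before each API request keeps a single-process client within both limits.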

2.2.1. Query

From each site, we expect to receive a static sample of (N=100) queries at the beginning of the challenge. The sample is static in the sense that it will not change during the challenge. It is important that the queries in the sample are expected to be issued frequently enough for the duration of the challenge. The least frequent (tail) queries are not very useful for the challenge, as they will not be issued often enough.

This endpoint provides ways to manipulate the set of queries before the challenge starts.

PUT /api/v2/site/query

Update (or initialize) the query set. This can only be done before the challenge has started.

Per query, you can mark its type: train, test or eval. Test queries are not supposed to be evaluated online, so participants should not expect any feedback for queries other than train queries (during the training phase). In fact, we may return an error when you try to submit feedback for a query of another type. The default query type is “train”, which is also used when the type is omitted.

Optionally, the “qid” (i.e., not the site_qid) can also be added to each query. If this is not done, the query is assigned an automatic qid.

Parameters:
  • key – your API key
Request Headers:
 
Content:
{
    "queries": [
        {
            "qstr": "jaguar",
            "type": "train",
            "site_qid": "48474c1ab6d3541d2f881a9d4b3bed75"
        },
        {
            "qstr": "apple",
            "type": "train",
            "site_qid": "30c6677b833454ad2df762d3c98d2409"
        }
    ]
}
Status Codes:
Return:

See GET /api/v2/site/query.
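The request body above could be assembled as in the following sketch. The function names are invented for this example, and the `Content-Type: application/json` header is an assumption, since the required request headers are not listed in this section. Authentication is omitted for brevity.

```python
import json
import urllib.request

API_BASE = "http://api.trec-open-search.org/api/v2"
VALID_TYPES = {"train", "test", "eval"}  # the query types named in the text


def build_query_payload(queries):
    """Turn (qstr, site_qid, type) tuples into the documented request body."""
    items = []
    for qstr, site_qid, qtype in queries:
        if qtype not in VALID_TYPES:
            raise ValueError(f"unknown query type: {qtype!r}")
        items.append({"qstr": qstr, "type": qtype, "site_qid": site_qid})
    return {"queries": items}


def build_put_query_request(payload):
    """Prepare (without sending) the PUT request that uploads the query set."""
    body = json.dumps(payload).encode("utf-8")
    request = urllib.request.Request(API_BASE + "/site/query", data=body, method="PUT")
    # Assumption: the API accepts a JSON body with this content type.
    request.add_header("Content-Type", "application/json")
    return request


payload = build_query_payload([
    ("jaguar", "48474c1ab6d3541d2f881a9d4b3bed75", "train"),
    ("apple", "30c6677b833454ad2df762d3c98d2409", "train"),
])
request = build_put_query_request(payload)
```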

DELETE /api/v2/site/query

Delete the query set. This can only be done before the challenge has started.

Parameters:
  • key – your API key.
Status Codes:

GET /api/v2/site/query

Obtain the query set.

Parameters:
  • key – your API key
Status Codes:
Return:
{
    "queries": [
        {
            "creation_time": "Sun, 27 Apr 2014 13:46:00 -0000",
            "qstr": "jaguar",
            "type": "train",
            "site_qid": "48474c1ab6d3541d2f881a9d4b3bed75",
            "qid": "S-q1"
        },
        {
            "creation_time": "Sun, 27 Apr 2014 13:46:00 -0000",
            "qstr": "apple",
            "type": "test",
            "site_qid": "30c6677b833454ad2df762d3c98d2409",
            "qid": "S-q2"
        }
    ]
}

2.2.2. Doclist

Per query, the challenge will provide a preselected doclist of (M=100) documents to the participants. The selection criteria are up to the site.

As the documents to be considered for a query may change over the course of the challenge, the API provides an endpoint at PUT /api/v2/site/doclist/(site_qid) to keep the doclist up to date.

PUT /api/v2/site/doclist/(site_qid)

Update the document list for a query.

The doclist defines the set of documents that are returnable for a query. The documents in the list are expected to be uploaded before you update this list. Deleting individual documents is possible but not necessary; it is the doclist that matters.

Parameters:
  • key – your API key
  • site_qid – the site’s query identifier
Request Headers:
 
Content:
{
    "doclist": [
        {"site_docid": "4922d3c4fdb24296a90a20bdd20e"},
        {"site_docid": "af1594296a90da20bdd20e40e737"},
        {"site_docid": "b5ee9b2e327493c4fdb24296a94a"}
    ]
}
Status Codes:
Return:

See GET /api/v2/site/doclist/(site_qid).
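Because documents are expected to be uploaded before the doclist that refers to them, a site may want to plan its calls in that order. The sketch below only builds the list of calls and sends nothing; the function name and the shape of the `documents` argument are assumptions made for this example.

```python
import json


def plan_doclist_update(site_qid, documents):
    """Return (method, path, body) steps: every document first, the doclist last.

    `documents` maps site_docid -> {"title": ..., "content": {...}}.
    """
    steps = []
    for site_docid, doc in documents.items():
        body = {"site_docid": site_docid,
                "title": doc["title"],
                "content": doc["content"]}
        steps.append(("PUT", f"/api/v2/site/doc/{site_docid}", json.dumps(body)))
    # The doclist goes last, so it never references a missing document.
    doclist = {"doclist": [{"site_docid": d} for d in documents]}
    steps.append(("PUT", f"/api/v2/site/doclist/{site_qid}", json.dumps(doclist)))
    return steps
```

Identifiers that contain characters such as `/` (for example, when a URL is used as the document identifier) would presumably need escaping before being placed in the path; that is left out here.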

GET /api/v2/site/doclist/(site_qid)

Retrieve the document list for a query.

This doclist defines the set of documents that are returnable for a query. You are free to update this list when the set of documents changes over time.

Parameters:
  • key – your API key
  • site_qid – the site’s query identifier
Status Codes:
Return:
{
    "doclist": [
        {"site_docid": "4922d3c4fdb24296a90a20bdd20e"},
        {"site_docid": "af1594296a90da20bdd20e40e737"},
        {"site_docid": "b5ee9b2e327493c4fdb24296a94a"}
    ]
}

Some sites may define “relevance_signals” for each document in this list.

2.2.3. Doc

The endpoint at PUT /api/v2/site/doc/(site_docid) can be used to update the content of individual documents.

PUT /api/v2/site/doc/(site_docid)

Store a single document. Feel free to use fields (such as ‘description’ in the example) if you have them. You are free to use any document identifier you wish (be it a url, a hash of the url, or anything else you use internally).

Parameters:
  • key – your API key
  • site_docid – the site’s document identifier
Request Headers:
 
Content:
{
     "content": {"description": "Lorem ipsum dolor sit amet",
                 "short_description" : "Lorem ipsum",
                 ...},
     "site_docid": "b59b2e327493c4fdb24296a90a20bdd20e40e737",
     "title": "Document Title"
}
Status Codes:
Return:

See GET /api/v2/site/doc/(site_docid).

DELETE /api/v2/site/doc/(site_docid)

Delete a single document.

Make sure to first update the doclist. In fact, deleting a document is not required after updating the doclist.

Note that documents are not really deleted. Rather, a “deleted” flag is set. This avoids assigning a new internal docid if the document is uploaded again.

Parameters:
  • key – your API key
  • site_docid – the site’s document identifier
Status Codes:
  • 200 OK – the document is deleted
  • 403 Forbidden – invalid key
  • 404 Not Found – document does not exist
  • 409 Conflict – the document cannot be deleted because it still appears in the doclist for a query (the query id will be returned)

GET /api/v2/site/doc/(site_docid)

Retrieve a single document that was uploaded before. Identify it with your own identifier.

Parameters:
  • key – your API key
  • site_docid – the site’s document identifier
Status Codes:
Return:
{
     "content": {"description": "Lorem ipsum dolor sit amet",
                 "short_description" : "Lorem ipsum",
                 ...},
     "creation_time": "Sun, 27 Apr 2014 23:40:29 -0000",
     "site_docid": "b59b2e327493c4fdb24296a90a20bdd20e40e737",
     "title": "Document Title"
}

2.2.4. Ranking

GET /api/v2/site/ranking/(site_qid)

Obtain a ranking for a query.

Every time this endpoint is called, a ranking produced by participants of the challenge is selected on a least-served basis. Due to this behavior, the ranking is likely to change with each call. Therefore, sites should cache rankings themselves in order to show users stable rankings for repeated queries.

The API ensures that only documents present in the most recent doclist for the requested query are returned, so sites are not expected to filter the ranking. If filtering is required for a query, please do so by updating the doclist. While we should aim to prevent this, it may happen that the site needs to make a last-minute decision not to include a certain document. Make sure to incorporate this decision in the feedback.

The site is expected to expose the retrieved ranking to a user and to return user feedback through PUT /api/v2/site/feedback/(sid) as soon as it is available.

Note

The session id (sid) needs to be stored on the site’s end and should be returned as part of the feedback.

Parameters:
  • key – your API key
  • site_qid – the site’s query identifier
Status Codes:
Return:
{
    "sid": "s1",
    "doclist": [
        {"site_docid": "4922d3c4fdb24296a90a20bdd20e"},
        {"site_docid": "af1594296a90da20bdd20e40e737"},
        {"site_docid": "b5ee9b2e327493c4fdb24296a94a"}
    ]
}
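Since each call to the ranking endpoint may return a different ranking, a site could keep a small cache so that repeated queries see a stable result. The following is a minimal sketch: `fetch` stands for whatever function the site uses to call GET /api/v2/site/ranking/(site_qid), and keying the cache by user as well as query, along with the 300-second lifetime, are assumptions of this example rather than requirements of the API.

```python
import time


class RankingCache:
    """Serve a stable ranking per (user, query) for `ttl` seconds."""

    def __init__(self, fetch, ttl=300.0, clock=time.monotonic):
        self.fetch = fetch    # callable: site_qid -> {"sid": ..., "doclist": [...]}
        self.ttl = ttl
        self.clock = clock
        self.entries = {}     # (user_id, site_qid) -> (expiry time, ranking)

    def get(self, user_id, site_qid):
        now = self.clock()
        cached = self.entries.get((user_id, site_qid))
        if cached is not None and cached[0] > now:
            return cached[1]
        # Cache miss or expired entry: ask the API for a fresh ranking.
        ranking = self.fetch(site_qid)
        self.entries[(user_id, site_qid)] = (now + self.ttl, ranking)
        return ranking
```

The cached ranking keeps its "sid", so the feedback for that session can later be sent with the correct session identifier.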

2.2.5. Feedback

PUT /api/v2/site/feedback/(sid)

Store user feedback for a session obtained through GET /api/v2/site/ranking/(site_qid). The feedback can be stored multiple times for the same session if more feedback becomes available. In that case, the old feedback will be overwritten; it is not additive. So if multiple clicks come in one by one, make sure to include all of them each time you update the feedback.

Note

It is expected that the doclist is the actual doclist that was shown to the user. This is important because the site may have had to make a last-minute decision not to include a certain document. That a shown document did not obtain a click is valuable information for a participant.

Parameters:
  • key – your API key
  • sid – the session’s identifier
Request Headers:
 
Content:
{
    "sid": "s1",
    "site_qid": "48474c1ab6d3541d2f881a9d4b3bed75",
    "type": "clicks",
    "doclist": [
        {
            "site_docid": "af1594296a90da20bdd20e40e737",
            "clicked": true
        },
        {"site_docid": "b5ee9b2e327493c4fdb24296a94a"},
        {"site_docid": "4922d3c4fdb24296a90a20bdd20e"}
    ]
}
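Because each feedback update replaces the previous one, a site needs to keep the full click state per session and resend all of it. One way to do this is sketched below; the class `SessionFeedback` is invented for this example and only builds the request body shown above.

```python
class SessionFeedback:
    """Accumulate clicks for one session and always emit the complete state."""

    def __init__(self, sid, site_qid, shown_docids):
        self.sid = sid
        self.site_qid = site_qid
        self.shown = list(shown_docids)  # the list actually shown, in order
        self.clicked = set()

    def click(self, site_docid):
        if site_docid not in self.shown:
            raise ValueError(f"{site_docid} was not shown in session {self.sid}")
        self.clicked.add(site_docid)

    def payload(self):
        """Build the full body, including every click seen so far."""
        doclist = []
        for docid in self.shown:
            entry = {"site_docid": docid}
            if docid in self.clicked:
                entry["clicked"] = True
            doclist.append(entry)
        return {"sid": self.sid, "site_qid": self.site_qid,
                "type": "clicks", "doclist": doclist}
```

After every new click, the site would call `click(...)` and then send `payload()` again, so that earlier clicks are not lost when the stored feedback is overwritten.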

In case Team Draft Interleaving was performed, this should be encoded as follows.

Content:
{
    "sid": "s1",
    "site_qid": "4922d3c4fdb24296a90a20bdd20e",
    "type": "tdi",
    "doclist": [
        {
            "site_docid": "b5ee9b2e327493c4fdb24296a94a",
            "clicked": true,
            "team": "site"
        },
        {
            "site_docid": "af1594296a90da20bdd20e40e737",
            "clicked": true,
            "team": "participant"
        }
    ]
}
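The Team Draft Interleaving encoding above could be produced as in the following sketch. The function names and the input tuple format are assumptions of this example. Counting clicks per team is a common way to credit an interleaved impression, but how the challenge itself aggregates these clicks is not specified here.

```python
from collections import Counter


def tdi_payload(sid, site_qid, interleaved):
    """Build a "tdi" feedback body.

    `interleaved` is a list of (site_docid, team, clicked) in shown order,
    where team is "site" or "participant" as in the example above.
    """
    doclist = []
    for site_docid, team, clicked in interleaved:
        entry = {"site_docid": site_docid, "team": team}
        if clicked:
            entry["clicked"] = True
        doclist.append(entry)
    return {"sid": sid, "site_qid": site_qid, "type": "tdi", "doclist": doclist}


def clicks_per_team(payload):
    """Count clicked documents per team in a "tdi" feedback body."""
    return Counter(d["team"] for d in payload["doclist"] if d.get("clicked"))
```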

Historical feedback can be added through PUT /api/v2/site/historical/(site_qid).

Status Codes:

2.2.6. Historical Feedback

PUT /api/v2/site/historical/(site_qid)

Store historical user feedback for a query. This is different from live feedback, which can be stored through PUT /api/v2/site/feedback/(sid).

The feedback can be stored multiple times for the same query. The old version will be overwritten; it is not additive.

Note

It is expected that the doclist is the actual doclist that was shown to the user. In case multiple doclists were shown for the same query, an average ranking may be returned (documents sorted by average rank).

Parameters:
  • key – your API key
  • site_qid – the site’s query identifier
Request Headers:
 
Content:
{
    "type": "ctr",
    "doclist": [
        {
            "site_docid": "b5ee9b2e327493c4fdb24296a94a",
            "clicked": 0.7
        },
        {
            "site_docid": "4922d3c4fdb24296a90a20bdd20e",
            "clicked": 0.4
        }
    ]
}
Status Codes:

GET /api/v2/site/historical/(site_qid)

Retrieve historical user feedback for a query. This is different from live feedback, which can be retrieved through GET /api/site/feedback/(key)/(sid).

Parameters:
  • key – your API key
  • site_qid – the site’s query identifier, or “all”
Request Headers:
 
Status Codes:
Return:
{
    "type": "ctr",
    "doclist": [
        {
            "site_docid": "b5ee9b2e327493c4fdb24296a94a",
            "clicked": 0.7
        },
        {
            "site_docid": "4922d3c4fdb24296a90a20bdd20e",
            "clicked": 0.4
        }
    ]
}
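A site that has click logs for a query could derive a "ctr" body like the one above as follows. This is a sketch under two assumptions: that "clicked" holds the number of clicks divided by the number of impressions, and that documents are ordered by their average rank when several doclists were shown for the same query. The function name and the log format are invented for this example.

```python
from collections import defaultdict


def historical_ctr_payload(impressions):
    """Aggregate logged impressions into a "ctr" feedback body.

    `impressions` is a list of shown rankings; each ranking is a list of
    (site_docid, clicked) pairs in display order.
    """
    shown = defaultdict(int)
    clicks = defaultdict(int)
    rank_sum = defaultdict(int)
    for ranking in impressions:
        for rank, (site_docid, clicked) in enumerate(ranking, start=1):
            shown[site_docid] += 1
            clicks[site_docid] += int(clicked)
            rank_sum[site_docid] += rank
    # Sort by average rank; ties fall back to the document id for determinism.
    ordered = sorted(shown, key=lambda d: (rank_sum[d] / shown[d], d))
    doclist = [{"site_docid": d, "clicked": clicks[d] / shown[d]} for d in ordered]
    return {"type": "ctr", "doclist": doclist}
```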