3.1. API Installation Tutorial

Note

You may not need to install the API yourself; read the guide first.

This tutorial describes how to install, run and develop with the Living Labs API.

If all you want is to participate in the Lab, you do not necessarily need any of the following. Instead, you can implement a client that talks to our API at http://api.living-labs.net/api/ (for the CLEF competition) or http://api.trec-open-search.org/api (for the TREC OpenSearch competition). However, the code we provide includes a simple baseline implementation that talks to our API and that you may find useful. Furthermore, if you install the API and dashboard on your own machine, debugging your code becomes much easier.

If you have comments or questions, please file an issue at https://bitbucket.org/living-labs/ll-api/issues or contact the main developer directly at anne.schuth@uva.nl.

Documentation (including this tutorial) can be found here: http://doc.living-labs.net/en/latest/

3.1.1. Obtain source code

You can clone the repository that contains all the Living Labs API’s code as follows:

$ git clone https://bitbucket.org/living-labs/ll-api.git

If you plan to make changes, please first create a fork through the Bitbucket interface and then clone your own fork. That way, you can push your changes and open a pull request so that they can be merged back.

3.1.2. Install prerequisites

Our code targets Python 2.6/2.7. It definitely won’t run on Python 3.x, and most likely not on earlier versions of Python. If you want to run the API yourself, or run the prepackaged clients that communicate with an API, a couple of prerequisites are needed. Installing them is easy if you have pip installed:

$ sudo pip install -r requirements.txt

If you don’t have pip yet, install it using easy_install pip. Windows users may want to read here: http://stackoverflow.com/questions/4750806/how-to-install-pip-on-windows

You may need to install the python-dev package. On some systems (for instance on Windows), you may also need to install Numpy/Scipy manually first.
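Before installing the requirements, it can help to confirm which interpreter you are about to use. The following is a minimal sketch, not part of the repository; the function name is_supported is hypothetical, and it only encodes the version range stated above.

```python
import sys


def is_supported(version_info):
    """Return True for Python 2.6 or 2.7, the versions this tutorial names."""
    major, minor = version_info[0], version_info[1]
    return major == 2 and minor in (6, 7)


if __name__ == "__main__":
    current = sys.version_info[:2]
    if is_supported(current):
        print("Python %d.%d should work" % current)
    else:
        print("Python %d.%d is not supported by this code base" % current)
```

Run it with the same interpreter that pip installs into, so that the check and the installed packages refer to the same Python.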

3.1.3. Done?

If you only want to run a client, you have all you need. Clients are pieces of code that talk to the Living Labs API. We recognize two types of clients: participants and sites. Example clients are in the repository in the ll/clients directory. See Running Clients.

In case you want to run your own version of the API (for testing purposes), you’ll have to continue.

You don’t necessarily have to do that, because our API is already running here: http://api.living-labs.net/api/ (for CLEF) or http://api.trec-open-search.org (for TREC OpenSearch).

3.1.4. Setup MongoDB

If you don’t already have MongoDB, you may follow a guide for your operating system at this page: http://docs.mongodb.org/manual/installation/. You’ll need MongoDB version >=2.6.
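To check the version requirement programmatically, a small comparison like the one below may be enough. This is a sketch under assumptions: the helper names are hypothetical, and the sample string "db version v2.6.12" is only an assumed form of what your MongoDB installation reports. The parser simply picks the first X.Y or X.Y.Z number it finds.

```python
import re

MIN_VERSION = (2, 6)  # minimum MongoDB version stated in this tutorial


def parse_version(text):
    """Extract the first X.Y or X.Y.Z version number from text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)(?:\.(\d+))?", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.groups() if part is not None)


def is_new_enough(text, minimum=MIN_VERSION):
    """Return True if the version found in text is at least minimum."""
    return parse_version(text) >= minimum


# Assumed sample strings, for illustration only.
print(is_new_enough("db version v2.6.12"))
print(is_new_enough("db version v2.4.9"))
```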

Then you can choose to run with or without authentication (without is easier, but insecure).

Either way, move to the ll-api directory:

$ cd ll-api

3.1.4.1. Authenticated

To run MongoDB with authentication enabled you can run it with the provided configuration file config/mongodb.conf (you may have to edit the data path).

First start a MongoDB daemon as follows:

$ mongod --config config/mongodb.conf

Now, use the admin tool to create a database user and database admin, which you will need later to access the database. Note that these database users are different from the LivingLabs users you are going to create later.

Replace USERSECRET and ADMINSECRET by your desired user and admin passwords and remember them.

$ ./bin/admin db --setup-db-users --mongodb_db ll --mongodb_user ll --mongodb_user_pw USERSECRET --mongodb_admin admin --mongodb_admin_pw ADMINSECRET

Now, we use the admin tool to generate a configuration file containing the database username and password, which we will need later. Again, replace the passwords!

$ ./bin/admin db --export-conf-file config/db.ini --mongodb_db ll --mongodb_user ll --mongodb_user_pw USERSECRET

The tool will export the database username and password to the db.ini file. Never add this file to a code repository; doing so would be a severe security risk.

3.1.4.2. Non-Authenticated

Running without authentication is fine for development purposes. Otherwise, make sure to use authentication. Start a MongoDB daemon as follows:

$ mongod
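To confirm that the daemon is accepting connections before you start the API, you can probe its TCP port. The sketch below assumes MongoDB's default port 27017; if your configuration file sets a different port, substitute it. The helper name port_open is hypothetical.

```python
import socket


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        conn = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        return False
    conn.close()
    return True


if __name__ == "__main__":
    # 27017 is MongoDB's default port; a custom configuration may use another one.
    print(port_open("localhost", 27017))
```

A successful connection only shows that something is listening on the port; it does not verify credentials in an authenticated setup.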

3.1.5. Run the API

We make a local copy of the user settings file, config/livinglabs.ini, so it is safer to make changes to it later:

$ cp config/livinglabs.ini config/livinglabs.local.ini

Furthermore, there is a configuration file ll/core/config.py, which stores API constants such as web addresses, e-mail addresses and competition deadlines. The most recent version of this file in the repository contains the most recent information on the official TREC OpenSearch rounds. You can change this on your own server if necessary.

To start the API, run the following command (without config/db.ini for an unauthenticated setup):

$ ./bin/api -c config/livinglabs.local.ini config/db.ini

If you want the API to reload automatically when you change the code (which is incredibly handy when developing), run it with the --debug flag (without config/db.ini for an unauthenticated setup):

$ ./bin/api -c config/livinglabs.local.ini config/db.ini --debug

In general, use --help or -h for more information.

3.1.6. Fill the Database

To fill the database with a standard configuration, including clients and sites, a fixture is available in the dump directory. We use the admin tool to import this fixture (without -c config/db.ini for an unauthenticated setup):

$ ./bin/admin db --import-json dump/ -c config/db.ini

We want to check that the users have been created. Users are clients and sites connecting to the LivingLabs API and should not be confused with the database users created in the Setup MongoDB section. To show all users (clients and sites), issue the following command (without -c config/db.ini for an unauthenticated setup):

$ ./bin/admin user -c config/db.ini --show

You will see the following:

E0016261DE4C0D61-M6C4AMHHE4WV4OVY uva test@example.com SITE
9EA887B684DD5822-JBB2XOCVEGYE7YAZ user1 test1@example.com PARTICIPANT ADMIN
77DBF9C7A1F70422-EZICBLYSCMMBJWKR user2 test2@example.com PARTICIPANT
  • uva is a site, with sitepass as its standard password.
  • user1 is a verified participant, which means it has been authorized to connect with sites via the Dashboard. user1 is also an admin user, so you can use it to change global settings on the Dashboard. Its password is partpass.
  • user2 is an unverified participant; it still has to be verified via the Dashboard by an administrator. The standard password for user2 is part2pass.

The user e-mail addresses, combined with the passwords mentioned above, can be used to log in to the Dashboard. On the Dashboard, you can also change the passwords.

Remember the keys as well; you will need them when creating clients in section Running Clients.
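If you prefer to collect the keys programmatically, the listing shown above can be parsed with a few lines of Python. This sketch assumes each line has the whitespace-separated form KEY NAME EMAIL ROLE [ROLE ...], as in the sample output; parse_users is a hypothetical helper, not part of the admin tool.

```python
def parse_users(output):
    """Parse lines of the form 'KEY NAME EMAIL ROLE [ROLE ...]' into dicts."""
    users = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or unexpected lines
        users.append({
            "key": fields[0],
            "name": fields[1],
            "email": fields[2],
            "roles": fields[3:],
        })
    return users


# The sample listing from this section.
SAMPLE = """\
E0016261DE4C0D61-M6C4AMHHE4WV4OVY uva test@example.com SITE
9EA887B684DD5822-JBB2XOCVEGYE7YAZ user1 test1@example.com PARTICIPANT ADMIN
77DBF9C7A1F70422-EZICBLYSCMMBJWKR user2 test2@example.com PARTICIPANT
"""

# Map each user name to its API key.
keys = dict((user["name"], user["key"]) for user in parse_users(SAMPLE))
print(keys["uva"])  # E0016261DE4C0D61-M6C4AMHHE4WV4OVY
```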

3.1.7. Running Clients

Clients are pieces of code that talk to the Living Labs API. We recognize two types of clients: participants and sites. Sites are search engines that share queries, documents and clicks. Participants rank documents for queries using clicks. Clients need API keys. You can use the keys obtained in the Fill the Database section or look them up via the Dashboard.

3.1.7.1. Run a Site

To run a site client and upload queries and documents, you can do the following:

$ ./bin/client-site --host localhost --key SITEKEY -q -d

This assumes the API runs on localhost, your own computer. If the --host argument is omitted, a default online API (specified in ll/core/config.py) is used.

It will take TREC queries/runs/documents (see -h for file locations and how to change them) as a basis. By default, train queries are added. If you want to upload test queries, use --query_type test. Alternatively, with the --letor switch, this client will accept Learning to Rank (Letor) data.

Then, to simulate interactions, run the following:

$ ./bin/client-site --host localhost --key SITEKEY -s

Again, this will take TREC data (qrels) to simulate clicks using a simple cascade click model. With the --letor switch, it will use a Learning to Rank (Letor) data set instead.
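For readers unfamiliar with cascade click models, the idea can be sketched as follows: a simulated user scans the ranking from the top, clicks each document with a probability that depends on its relevance label, and may stop scanning after a click. The code below is an illustration under assumptions, not the simulator shipped in the repository; the function name and the probability tables are hypothetical.

```python
import random


def simulate_clicks(relevances, p_click, p_stop, rng):
    """Simulate one user scanning a ranking top-down under a cascade model.

    relevances: relevance label of the document at each rank.
    p_click: dict mapping a relevance label to a click probability.
    p_stop: dict mapping a relevance label to the probability of
        stopping after a click on a document with that label.
    Returns the list of clicked ranks (0-based).
    """
    clicks = []
    for rank, rel in enumerate(relevances):
        if rng.random() < p_click[rel]:
            clicks.append(rank)
            if rng.random() < p_stop[rel]:
                break  # the user is satisfied and stops scanning
    return clicks


# Hypothetical parameters: a "perfect" user who clicks only relevant
# documents and always stops after the first click.
P_CLICK = {0: 0.0, 1: 1.0}
P_STOP = {0: 0.0, 1: 1.0}

print(simulate_clicks([0, 0, 1, 1], P_CLICK, P_STOP, random.Random(42)))  # [2]
```

With less extreme probabilities the clicks become noisy, which is what makes simulated feedback a useful stand-in for real user behaviour during development.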

The simple simulator will print the NDCG value of all the rankings it receives from the API.

Note that the site client is not at all aware of the participants; it simply talks to the API. So if multiple participant clients are present, the site client does not know about this, and the NDCG will reflect the average performance of all participants. This is by design. For per-participant statistics, use the Dashboard.
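As a reminder of what the printed number means, here is a sketch of one common NDCG formulation, with gain 2^rel - 1 and a log2 rank discount. The exact variant and cutoff used by the simulator are not specified here, so treat this as an assumption. For simplicity, the ideal ranking is computed from the same list of labels.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain with gain 2^rel - 1 and a log2 rank discount."""
    return sum((2 ** rel - 1) / math.log(rank + 2, 2)
               for rank, rel in enumerate(relevances))


def ndcg(relevances):
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(relevances, reverse=True))
    if ideal == 0:
        return 0.0  # no relevant documents at all
    return dcg(relevances) / ideal


print(ndcg([1, 0, 0]))  # 1.0: the only relevant document is ranked first
print(ndcg([0, 1]))     # below 1.0: the relevant document is ranked second
```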

If you want to run multiple sites, you should create multiple keys and start multiple instances that talk to the same API.
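When juggling several site keys, it can be convenient to generate the command lines rather than type them. The sketch below only builds argument lists from the flags shown in this section; SITEKEY1 and SITEKEY2 are placeholders, and site_commands is a hypothetical helper.

```python
def site_commands(key, host="localhost"):
    """Return the upload and simulation commands for one site key as argv lists."""
    base = ["./bin/client-site", "--host", host, "--key", key]
    return [base + ["-q", "-d"], base + ["-s"]]


# SITEKEY1 and SITEKEY2 are placeholders for the keys of two different sites.
for key in ["SITEKEY1", "SITEKEY2"]:
    for argv in site_commands(key):
        print(" ".join(argv))
```

Each argument list could be passed to subprocess.call from the ll-api directory, or the printed lines can be pasted into separate terminals.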

For your convenience, you can download learning to rank (Letor) data sets here:

3.1.7.2. Run a Participant

To run a simple participant implementation, you can do this, again assuming the API runs on localhost:

$ ./bin/client-participant --host localhost -k PARTICIPANTKEY -s

The API key can be obtained through a procedure explained in Fill the Database or through the Dashboard.

This will run a baseline system that simply greedily reranks by the number of clicks. Note that you may need to specify the host/port where the API is running (see -h for details on how to do that).
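The reranking idea behind such a baseline can be sketched in a few lines: sort the candidate documents by how often each has been clicked, most clicked first. This is an illustration only; the document ids and counts are made up, and the tie-breaking rule (keep the original order) is an assumption rather than a description of the shipped client.

```python
def rerank_by_clicks(doclist, click_counts):
    """Order documents by descending click count, keeping the original order for ties."""
    # sorted() is stable, so documents with equal counts stay in doclist order.
    return sorted(doclist, key=lambda doc: -click_counts.get(doc, 0))


# Hypothetical document ids and click counts, for illustration only.
doclist = ["d1", "d2", "d3", "d4"]
click_counts = {"d3": 5, "d2": 1}
print(rerank_by_clicks(doclist, click_counts))  # ['d3', 'd2', 'd1', 'd4']
```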

If you want to run multiple participants, you should create multiple keys and start multiple instances that talk to the same API.

3.1.8. Dashboard Installation

Note

You may not need to install a Dashboard yourself; read the guide first.

If you are running a local version of the API for development, it is a good idea to also run a dashboard with it.

To start the dashboard, fill out the dashboard fields in the local copy of the general LivingLabs configuration file (config/livinglabs.local.ini). In particular, you will need a recaptcha key pair (see http://www.google.com/recaptcha), which provides the values for the recaptchaprivate and recaptchapublic fields. csrfsecrettoken and secretkey are both random strings that you should generate yourself.
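One way to generate the two random strings is with Python's standard library. The sketch below draws alphanumeric characters from the operating system's entropy source; the length of 32 and the alphabet are arbitrary choices, not requirements of the dashboard, and random_secret is a hypothetical helper.

```python
import random
import string

ALPHABET = string.ascii_letters + string.digits


def random_secret(length=32):
    """Return a random alphanumeric string drawn from the OS entropy source."""
    rng = random.SystemRandom()
    return "".join(rng.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    # Copy the printed values into the matching fields of the local ini file.
    print("csrfsecrettoken: " + random_secret())
    print("secretkey: " + random_secret())
```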

Then run the following command (without config/db.ini for an unauthenticated setup):

$ ./bin/dashboard -c config/livinglabs.local.ini config/db.ini

In general, use --help or -h for more information. By default the dashboard will run on port 5001.

On the Dashboard, you can log in using the users created under Fill the Database. You can also create new users using the Register button. As a participant, you can use the Dashboard to add yourself to one or more sites. If you are an admin, you can verify participants, so they are able to connect with a site.