BBC News Labs Documentation

Documentation for BBC News Labs APIs & #newsHACK events

Juicer v1 API Documentation

DEPRECATED! Please use the new Juicer v2 API

About the Juicer

The BBC Labs News Juicer is the software responsible for fetching content from the web - articles, videos and images from the BBC and other sources - parsing the content and tagging it with people, organisations and places that correspond to DBpedia entries.

The public Juicer API allows you to specify start and end dates for articles, select which sources or concepts you wish to search for, and run free-text searches in real time as articles, tweets and other content are ingested.

Read more about the Juicer

Juicer Client Libraries

If you are not using Ruby or Node.JS (or just want to use the REST API directly) see the documentation for REST API endpoints below.

Juicer Postman Collection

If you install the free Postman REST Client you can import the Juicer API into it using this URL:

This allows you to easily try out the API and explore how it works without writing any code.

You will need to configure an "Environment" within Postman to be able to make calls:

REST API Endpoints

1. Get Articles



You can filter and sort the articles returned from the articles endpoint. Parameters can be provided in the query string.

Available parameters:

  • text: keywords to search for. Searches in title, description and body of the article.
  • product[]: scopes the results to certain products. Multiple products can be specified by adding multiple product[] keys and values. Just specify 'NewsWeb' to return only BBC News articles.
  • content_format[]: scopes the results to certain formats. Multiple formats can be specified by adding multiple content_format[] keys and values.
  • section[]: scopes the results to certain sections. Multiple sections can be specified by adding multiple section[] keys and values.
  • site[]: scopes the results to certain sites. Multiple sites can be specified by adding multiple site[] keys and values.
  • published_after: returns only articles published after this date. The date format is yyyy-mm-dd.
  • published_before: returns only articles published before this date. The date format is yyyy-mm-dd.
  • recent_first: Specify 'yes' to sort results by date (with most recent first) instead of by relevance to keywords.

All parameters are optional. If parameters are omitted, the endpoint will just return the latest articles.
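
As a rough sketch of how these parameters combine into a query string, the Python snippet below builds a request URL with the standard library only. The host https://juicer.example is a placeholder and not the real Juicer address (this page does not reproduce it), and build_articles_url is a hypothetical helper name; articles.json is the endpoint name used elsewhere in this document. Only some of the parameters are covered; section[] and site[] would follow the same repeated-key pattern.

```python
from datetime import date
from urllib.parse import urlencode

# Placeholder host (an assumption): substitute the real Juicer base URL.
BASE_URL = "https://juicer.example"


def build_articles_url(apikey, text=None, products=(), content_formats=(),
                       published_after=None, published_before=None,
                       recent_first=False):
    """Return an articles.json URL with the given filters applied."""
    params = []
    if text:
        params.append(("text", text))
    # Array-style parameters are repeated once per value.
    params += [("product[]", p) for p in products]
    params += [("content_format[]", f) for f in content_formats]
    # date.isoformat() yields the yyyy-mm-dd form.
    if published_after:
        params.append(("published_after", published_after.isoformat()))
    if published_before:
        params.append(("published_before", published_before.isoformat()))
    if recent_first:
        params.append(("recent_first", "yes"))
    params.append(("apikey", apikey))
    # safe="[]" leaves the brackets unescaped so the keys stay readable.
    return BASE_URL + "/articles.json?" + urlencode(params, safe="[]")


print(build_articles_url("YOUR_API_KEY", products=["NewsWeb"],
                         content_formats=["TextualFormat"], recent_first=True))
# prints https://juicer.example/articles.json?product[]=NewsWeb&content_format[]=TextualFormat&recent_first=yes&apikey=YOUR_API_KEY
```

Passing the parameters as a list of pairs, rather than a dict, is what allows the same key (such as product[]) to appear more than once.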

Example Request

…[]=NewsWeb&content_format[]=TextualFormat&recent_first=yes&apikey={{apikey}}

Complex Query Example

The Juicer supports complex queries across multiple sources. For example, the following query uses the search phrase kenya || nairobi AND (government || president || "Uhuru Kenyatta") and returns only articles from the listed products that match that phrase.

…[]=DailyNewsEgypt&product[]=KenyaBroadcastingCorporation&product[]=TechMoran&product[]=NigerDeltaStandard&product[]=NationalElectionCommissionSudan&content_format[]=TextualFormat&recent_first=yes&apikey={{apikey}}
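
Because the search phrase contains spaces, pipes, parentheses and quotation marks, it has to be percent-encoded before it goes into a query string. The Python sketch below shows one way to do that with the standard library. It assumes the phrase is sent in the text parameter described earlier (the example on this page does not show which parameter carries it), and the API key is a placeholder.

```python
from urllib.parse import parse_qs, urlencode

phrase = 'kenya || nairobi AND (government || president || "Uhuru Kenyatta")'

# Assumption: the search phrase is passed as the `text` parameter.
params = [
    ("text", phrase),
    ("product[]", "DailyNewsEgypt"),
    ("product[]", "KenyaBroadcastingCorporation"),
    ("content_format[]", "TextualFormat"),
    ("recent_first", "yes"),
    ("apikey", "YOUR_API_KEY"),  # placeholder key
]

# urlencode percent-encodes reserved characters and turns spaces into "+".
query = urlencode(params)

# Round-trip check: decoding the query string gives back the original phrase.
decoded = parse_qs(query)
print(decoded["text"][0] == phrase)  # prints True
```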

2. Get Article

You can get the full text of a specific article by using its id (this property is sometimes called the cps_id).


Example Request

…{{apikey}}

3. Get Products

Get a list of the products currently indexed and available from the Juicer.

"Products" are newspapers, broadcast TV channels and other sources.

To return only BBC News articles, specify product[]=NewsWeb when calling articles.json.


Example Request

…{{apikey}}
