Create Your Own KvasirA Integrations

Published 05 Nov 2019 by Kvasir Analytics

Introducing the KvasirA API


KvasirA is a service providing fast and powerful semantic search. You provide a document, and it quickly matches that document against one or more large document libraries, returning the documents closest in meaning. These document libraries are currently provided by us, but soon you’ll be able to create and query your own private document libraries. If you aren’t already familiar with KvasirA, you can read more here.

The KvasirA search technology can be easily integrated into many different applications. We have already developed and released integrations for Chrome, Firefox, Slack, and Microsoft Word, making it simple to add KvasirA to your current workflows today.

All our integrations are built on the KvasirA APIs, and we’re happy to announce that developers can now use those same APIs to bring the power of KvasirA to their own applications. In this post we’ll give an overview of the most important parts of the API: how to list and query the available public document libraries.

If you aren’t already familiar with HTTP methods and web APIs, you may wish to consult, for example, the MDN docs.

Listing available document libraries

To get the list of all currently available document libraries, we can perform an HTTP GET request to the address

https://demo.kvasira.com/api/libraries

If all goes well, KvasirA will return a JSON response similar to the one below:


{
  "data": [
    {
      "title": "Wiki (EN)",
      "id": "enwiki",
      "description": "English Wikipedia articles",
      "isPrivate": false,
      "running": true,
      "indexLanguage": "en",
      "queryLanguage": "multi",
      "token": "983ca6b5-2b3d-494e-9532-4bd64ec14b91",
      "faviconPath": "/favicons/Wikipedia's_W-8fcaedcf-5928-4de2-9338-f6d6a2698f3d.svg"
    },
    ...
    {
      "title": "ArXiv",
      "id": "arxiv",
      "description": "ArXiv papers",
      "isPrivate": false,
      "running": true,
      "indexLanguage": "en",
      "queryLanguage": "multi",
      "token": "b7a71570-a1d4-11e9-971f-f9c8a9c7b701",
      "faviconPath": "/favicons/index-1-e06b1eb0-a22f-11e9-971f-f9c8a9c7b701.png"
    }
  ]
}

The data field in the response will contain the list of available document libraries. In this case, our available document libraries are the English Wikipedia and ArXiv. Each document library in the list has the title, id and description of the library, in addition to other relevant information, including the language of the documents in the library. If running is true, that means the document library is up and ready to accept queries!
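As a rough illustration of reading this listing from code, the Python sketch below fetches the list and keeps only the libraries that report running as true. The function names fetch_libraries and running_library_ids are hypothetical helpers made up for this example, not part of KvasirA, and the sketch assumes the response has the shape shown above.

```python
import json
import urllib.request

# Listing endpoint taken from the post.
LIBRARIES_URL = "https://demo.kvasira.com/api/libraries"


def fetch_libraries(url=LIBRARIES_URL, timeout=10.0):
    """GET the library listing and return the parsed JSON document.

    Hypothetical helper: performs a network call, so it is only
    defined here and never invoked by this snippet.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def running_library_ids(listing):
    """Return the ids of libraries whose `running` flag is true.

    Assumes `listing` looks like the sample response: a dict with a
    `data` list of library objects. Entries without `running` are skipped.
    """
    return [lib["id"] for lib in listing.get("data", []) if lib.get("running")]
```

Calling running_library_ids(fetch_libraries()) would then give the ids that can be used as LIBRARY_ID in a query.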


Querying a document library

We can query a document library by issuing an HTTP POST request to

https://demo.kvasira.com/api/library/LIBRARY_ID/query?query_type=[url|text]&k=N,

where LIBRARY_ID is the id of the document library we want to query, N is the desired number of results, and query_type specifies the query type. The query type can be either url, in which case KvasirA fetches the page at a URL and uses it as the query document, or text, in which case the query document is provided as-is. The document is passed in the doc field of the JSON body of the POST request: a URL for the url query type, or the document text itself for the text query type.

Let’s try it by querying the English Wikipedia library with the Merge sort Wikipedia article. Using curl, this can be done with the command

curl \
  -X POST \
  -d '{"doc": "https://en.wikipedia.org/wiki/Merge_sort"}' \
  -H 'Content-Type: application/json' \
  'https://demo.kvasira.com/api/library/enwiki/query?query_type=url&k=3'
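The same request can be sketched in Python with only the standard library. The helper name build_query_request is hypothetical, and the defaults (query_type of url, k of 3) are chosen here just to mirror the curl command; the sketch only constructs the request and leaves sending it to the caller.

```python
import json
import urllib.parse
import urllib.request

# Base address taken from the URLs in the post.
BASE_URL = "https://demo.kvasira.com/api"


def build_query_request(library_id, doc, query_type="url", k=3):
    """Build (but do not send) a POST request for a library query.

    Hypothetical helper. `doc` is a URL when query_type is "url",
    or the document text itself when query_type is "text".
    """
    if query_type not in ("url", "text"):
        raise ValueError("query_type must be 'url' or 'text'")
    params = urllib.parse.urlencode({"query_type": query_type, "k": k})
    safe_id = urllib.parse.quote(library_id, safe="")
    url = f"{BASE_URL}/library/{safe_id}/query?{params}"
    # The document travels in the `doc` field of a JSON body.
    body = json.dumps({"doc": doc}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send the request.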

If the query is successful, the response will be similar to

{
  "response": {
    "time": 0.33982386589050293,
    "results": [
      {
        "title": "Merge sort",
        "id": 9604,
        "uri": "https://en.wikipedia.org/wiki/Merge_sort",
        "summary": "In computer science, merge sort (also commonly spelled mergesort) is an efficient, general-purpose, comparison-based sorting algorithm. Most implementations produce a stable sort, which means that the order of equal elements is the same in the input and output. Merge sort is a divide and conquer alg..."
      },
      {
        "title": "Sorting algorithm",
        "id": 13871,
        "uri": "https://en.wikipedia.org/wiki/Sorting_algorithm",
        "summary": "In computer science, a sorting algorithm is an algorithm that puts elements of a list in a certain order. The most frequently used orders are numerical order and lexicographical order. Efficient sorting is important for optimizing the efficiency of other algorithms (such as search and merge algorith..."
      },
      {
        "title": "Merge algorithm",
        "id": 9769,
        "uri": "https://en.wikipedia.org/wiki/Merge_algorithm",
        "summary": "Merge algorithms are a family of algorithms that take multiple sorted lists as input and produce a single list as output, containing all the elements of the inputs lists in sorted order. These algorithms are used as subroutines in various sorting algorithms, most famously merge sort. = Application =..."
      }
    ],
    "similarities": [
      0.18720299005508423,
      0.6811162233352661,
      0.6942106485366821
    ],
    "metadata": {
      "title": "Merge sort - Wikipedia",
      "uri": "https://en.wikipedia.org/wiki/Merge_sort"
    }
  }
}

The response field contains the outcome of our query. The actual query results are in the results list inside response. Each result has a title, a URI, a summary and an id. In this case, our results are Merge sort, Sorting algorithm and Merge algorithm. The time field in response indicates how long the search took, and metadata gives additional information about the query document, such as the title of the page for a URL query.
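To show how these fields might be consumed, here is a minimal Python sketch that turns the results list into typed records. Hit and parse_hits are hypothetical names introduced for illustration, and the sketch assumes every result carries the four fields described above.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One entry of `response.results` (hypothetical record type)."""
    id: int
    title: str
    uri: str
    summary: str


def parse_hits(payload):
    """Return the query results as a list of Hit records, in response order.

    Assumes `payload` is the parsed JSON of a successful query,
    shaped like the sample response.
    """
    results = payload["response"]["results"]
    return [Hit(r["id"], r["title"], r["uri"], r["summary"]) for r in results]
```

With the sample response, [hit.title for hit in parse_hits(payload)] would list the three article titles in the order KvasirA returned them.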

If the query isn’t successful, KvasirA will return an HTTP status code other than 200. In this case the response field will contain a string describing the error. For example, if the provided URL is not reachable, KvasirA will respond with a status code of 500 and the following body:

{
  "response": "Unable to extract content from page"
}
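One way a client might separate the two cases is sketched below in Python. KvasiraError and unwrap_response are hypothetical names for this example; the sketch assumes only what is stated above, namely that a status of 200 means success and that any other status comes with a string in the response field.

```python
import json


class KvasiraError(Exception):
    """Raised for any non-200 reply (hypothetical exception type)."""

    def __init__(self, status, message):
        super().__init__(f"HTTP {status}: {message}")
        self.status = status
        self.message = message


def unwrap_response(status, body):
    """Return the `response` object on success, raise KvasiraError otherwise.

    `status` is the HTTP status code and `body` the raw response text.
    """
    try:
        payload = json.loads(body)
    except ValueError:
        # Body was not JSON at all; report it verbatim.
        raise KvasiraError(status, body)
    if status != 200:
        raise KvasiraError(status, payload.get("response"))
    return payload["response"]
```

When the request is sent with urllib.request.urlopen, error statuses surface as urllib.error.HTTPError, whose code attribute and read() method supply the status and body expected here.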

That’s it for now! In our next blog post we’ll take a look at building a complete integration using the KvasirA API.