Data Access

Overview

This section sets out Reclink’s approach to providing data to partners, funders and researchers. Data will generally not be provided outside the arrangements set out here. Providing data incurs a cost, which we will raise when we discuss the data requirements for your project or funding.

Data Access Agreement

Before Reclink provides you with any data, you will be required to sign our data access agreement. A copy of the agreement will be provided when we discuss your data needs.

Data Lake

Reclink maintains a “data lake” containing cleaned data on participants, programs and surveys, together with related data. All data in the data lake is de-identified. Third parties will not be given identified data, and they will not be given direct access to the data lake.

Data Tiers

Reclink provides data to third parties at four levels of access: three standard tiers and a custom access arrangement.

  • Tier 1: Data in the form of reports and analysis. Reclink conducts the analysis to agreed requirements and at agreed cost, and provides the end results to you;
  • Tier 2: Summarised and aggregated data, supplied as a DuckDB database hosted on MotherDuck. You are responsible for accessing the data using your own tools;
  • Tier 3: Non-summarised data, supplied as a DuckDB database hosted on MotherDuck. You are responsible for accessing and analysing the data using your own tools;
  • Custom Access: On rare occasions we may, with prior agreement, provide access to extended datasets or specific aggregations. We are generally reluctant to do this, and any access we do agree to provide will be charged at agreed cost.

DuckDB and MotherDuck

DuckDB and MotherDuck are our standard way of providing data. You can query DuckDB directly using, for instance, the Python package, the R package or the command-line client provided by DuckDB. Numerous other extensions and connectors allow access from tools such as Power BI and Tableau. You can also attach to DuckDB databases hosted in MotherDuck using the MotherDuck extension together with a local DuckDB instance. Queries are written in an extended SQL dialect that is well documented on the DuckDB and MotherDuck sites. Reclink does not provide support for writing SQL queries against the data. All access is read-only, so any write operations must be performed on your own local copy.
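
As an illustration of the attach-and-copy pattern described above, the following is a minimal Python sketch, not a Reclink-supplied tool. It assumes the duckdb Python package is installed and that a MotherDuck access token is available to it (for example in the motherduck_token environment variable). The database name, table name, alias and local file name are hypothetical placeholders, and the table is assumed to sit in the default "main" schema; the real names will come from your data access arrangement and the data documentation.

```python
import re

_NAME = re.compile(r"[A-Za-z0-9_]+")


def _check(name: str) -> str:
    """Reject names that are not plain identifiers, since they are spliced into SQL."""
    if not _NAME.fullmatch(name):
        raise ValueError(f"unexpected name: {name!r}")
    return name


def attach_statement(database: str, alias: str = "reclink") -> str:
    """Build the SQL that attaches a MotherDuck database under a local alias."""
    return f"ATTACH 'md:{_check(database)}' AS {_check(alias)}"


def copy_table_locally(database: str, table: str, local_path: str = "work.duckdb") -> int:
    """Copy one shared table into a local DuckDB file and return its row count."""
    import duckdb  # third-party package: pip install duckdb

    _check(table)
    con = duckdb.connect(local_path)  # local, writable database file
    try:
        # Assumption: a MotherDuck token is already configured for this session.
        con.execute(attach_statement(database))
        # Assumption: the shared table lives in the default "main" schema.
        con.execute(
            f"CREATE OR REPLACE TABLE {table} AS SELECT * FROM reclink.main.{table}"
        )
        return con.execute(f"SELECT count(*) FROM {table}").fetchone()[0]
    finally:
        con.close()
```

A call such as copy_table_locally("reclink_tier2", "program_summary") (both names hypothetical) would leave a writable copy of the table in work.duckdb, so that any updates or derived tables are created locally rather than in the read-only shared database.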

Data Documentation

Reclink maintains data documentation to help users understand the data they are working with. This is provided in the form of a website. It lists each available table, describes the columns in each table and shows the lineage of the tables and, where appropriate, of the columns.