Deploying RSpace

The following information applies only to the Team and Enterprise Editions of RSpace.

Scope and intended audience

This document is for IT staff at an organisation that is considering purchasing, or has purchased, RSpace Team or Enterprise.

For Enterprise, this document aims to provide an overview of deployment options to help guide your decision on how to deploy RSpace.

Which Edition should I choose and where will RSpace be deployed?

This is a fundamental decision for you to make, and there are various options, all of which can be supported and have proven to be workable with existing customers.

For deployments of RSpace with fewer than 15 users that have selected RSpace Team Edition, only option 1 is available. However, this document highlights information about the SaaS offering that might be useful to the IT staff of an organization that has opted for the Team Edition.

Option 1: ResearchSpace operates RSpace as a SaaS offering installed on an AWS server that we manage for you.

In this scenario, RSpace is deployed on your own private AWS instance in the cloud. The cost of the AWS instance with unlimited data storage is included in your purchase. Installation, backup, updates and maintenance are all performed by ResearchSpace. For Team edition, ResearchSpace will select a data center location for you from the following list:

  • us-east-1 (Virginia)
  • us-west-1 (California)
  • eu-west-1 (Ireland)
  • eu-west-2 (London)
  • eu-central-1 (Frankfurt)
  • eu-central-2 (Zurich)
  • eu-south-1 (Milan)
  • ap-northeast-2 (Seoul)
  • ap-southeast-2 (Sydney)

Organizations that purchase RSpace Enterprise Edition may request one of these locations or any other AWS regional data center needed to meet their data storage and regulatory requirements. Hosting on AWS is a good option if you:

  • Are unable or unwilling to dedicate staff time to installing and maintaining RSpace.
  • Want complete convenience and to get up and running quickly.
  • Anticipate expanding usage over time but do not have suitable resources of your own to accommodate this - AWS is essentially unlimited in terms of data storage capability.

Option 2. (Requires RSpace Enterprise Edition). ResearchSpace remotely installs RSpace within a server you provide (powered by RSpace on Docker)

In this scenario, RSpace is installed on your institutional system - either physical machines, virtual hosts or an existing private cloud instance that you own and manage. The server must meet the requirements listed here. We will install RSpace on your server running on Docker; this is the recommended way to install RSpace on your own institutional system. This scenario is good if:

  • Your data-compliance guidelines require you to store research data on-premises.
  • You will still get all the benefits of RSpace Support, ensuring a reliable and stable RSpace deployment on your server.
  • You have staff with IT experience in managing servers in a Linux environment and with sufficient bandwidth. For example:
    • RSpace version updates are not performed automatically; instead, our support team schedules a maintenance slot with the institutional IT team.
    • The institutional IT team will be responsible for backups and for keeping the underlying Linux operating system up to date.
    • RSpace support requires an SSH connection to your institutional server in order to install, maintain and update the RSpace system.
  • You want absolute ownership and control over all aspects of the data life cycle, for example backup and recovery.

If your organization has opted to manage your Enterprise Edition server, then it is absolutely critical that you assign appropriate IT resources to keep the server properly maintained. Additionally you MUST be able to support timely remote examination of your server by ResearchSpace support specialists as needed. If you allow your server to fall more than 4 major releases or one year behind the current release, and you need to request ResearchSpace assistance with a complex, multistep server update or repair, then additional surcharges will apply.

Ultimately it is the customer's responsibility to make sure that you are using the latest version of RSpace. You can see the version of your RSpace server at the bottom left of the interface. It will be of the form 1.x.x. You can see the current release version in the changelog. Be sure to have your IT staff respond promptly to periodic, update-related requests from ResearchSpace.
Before installing or updating any on-premise RSpace server, always refer to the version-specific documentation files (e.g. RSpaceConfiguration.md) included with the download bundle for your version. Contact ResearchSpace for access to the download URL. For security reasons, you will not find those detailed instructions anywhere in our online documentation. Installation / update instructions are only available within the download bundle, and installation / update should only be attempted by qualified IT professionals.
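
The version comparison described above can be scripted as part of routine monitoring. The following is a minimal sketch, not an official ResearchSpace tool. It assumes that versions always have the form 1.x.x and that the middle number is what counts as a "major release" (an assumption; confirm with ResearchSpace how releases are counted). It covers only the release-count criterion, not the one-year criterion, and the function names and version numbers are hypothetical.

```python
from __future__ import annotations


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a version string of the form '1.x.x' into three integers."""
    parts = version.strip().split(".")
    if len(parts) != 3:
        raise ValueError(f"expected a version like 1.x.x, got {version!r}")
    first, middle, last = (int(p) for p in parts)
    return first, middle, last


def releases_behind(installed: str, current: str) -> int:
    """Return how many releases the installed version trails the current one.

    Assumption: the middle number counts releases, and both versions
    share the same first number.
    """
    inst = parse_version(installed)
    cur = parse_version(current)
    if inst[0] != cur[0]:
        raise ValueError("first version component differs; compare manually")
    return max(0, cur[1] - inst[1])


def needs_attention(installed: str, current: str, limit: int = 4) -> bool:
    """True when the installed version is more than `limit` releases behind."""
    return releases_behind(installed, current) > limit
```

The installed version would be read from the bottom left of the RSpace interface and the current version from the changelog, as described above.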

Option 3. (Requires RSpace Enterprise Edition). ResearchSpace remotely installs RSpace within a server you provide

In this scenario, RSpace is installed on your institutional system - either physical machines, virtual hosts or an existing private cloud instance that you own and manage. The server must meet the requirements listed here. We will install RSpace on your server running on a LAMA stack. This scenario is good if:

  • Your data-compliance guidelines require you to store research data on-premises.
  • You DO NOT want RSpace to run on top of Docker on your server. Instead RSpace will be installed on a LAMA stack, which consists of:
    • Linux (Ubuntu)
    • Apache2
    • MariaDB
    • Apache Tomcat
  • You will still get all the benefits of RSpace Support, ensuring a reliable and stable RSpace deployment on your server.
  • You have staff with IT experience in managing servers in a Linux environment and with sufficient bandwidth. For example:
    • RSpace version updates are not performed automatically; instead, our support team schedules a maintenance slot with the institutional IT team.
    • The institutional IT team will be responsible for backups and for keeping the underlying Linux operating system up to date.
    • RSpace support requires an SSH connection to your institutional server in order to install, maintain and update the RSpace system.
  • You want absolute ownership and control over all aspects of the data life cycle, for example backup and recovery.

RSpace connectivity

RSpace does not operate in isolation from your institutional data; in fact it shines when connecting and linking your research work together. In this section we review how the different deployment options described above affect these aspects of RSpace functionality.

Single Sign On

If you want your users to log in to RSpace using Single Sign On, RSpace Enterprise Edition supports this through the SAML2 and OpenID Connect protocols. Most Identity Providers (IdPs), such as Okta and Azure AD, support these protocols. For more details, see Setting up SingleSignOn authentication.

Connecting to your existing data storage

RSpace can store and manage all sorts of data files, but there are occasions when your researchers will want to link to data files on an institutional file server rather than bringing the files into RSpace. This might be the case if:

  • The data files are huge, e.g. large images or sequencing files.
  • Your data has to be stored on a particular file server for compliance reasons.

RSpace can talk to these servers using either the Samba (SMB) or SFTP protocol. It only requires read access so that it can list the files to link to.

This can be easier to set up for an on-prem installation; connecting from RSpace on AWS is entirely possible technically, but requires network access from RSpace to your file server. Please read Configuring Institutional File Systems for more details.
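
Before configuring a file system link, it can help to confirm that the file server is reachable over the network from the machine running RSpace. The sketch below is illustrative only. It assumes the standard ports (445 for SMB, 22 for SFTP), which your file server may not use, and the function names are hypothetical. A successful TCP connection shows only that the port is open; it does not verify credentials or read access.

```python
import socket

# Standard ports; your file server may be configured differently.
DEFAULT_PORTS = {"samba": 445, "sftp": 22}


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def check_file_server(host: str, protocol: str) -> bool:
    """Check whether the file server accepts connections for 'samba' or 'sftp'."""
    return can_connect(host, DEFAULT_PORTS[protocol])
```

For example, `check_file_server("files.example.org", "sftp")` (a placeholder hostname) would be run on the RSpace server itself, since that is the machine that needs the access.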

RSpace integrations

RSpace has integrations with many popular applications including Dropbox, Google Drive, OneDrive, Microsoft Teams, Slack, Office 365, protocols.io, GitHub, Figshare and Dataverse - see Integrations for a full list. The setup required for each integration varies. If you are running RSpace as a SaaS (option 1 above), ResearchSpace will be able to set up these integrations for you. If you are running RSpace on-prem (or, more specifically, the RSpace URL is not a researchspace.com URL), then you will have to configure these integrations yourself, as they often require proof of domain ownership to set up (e.g. Google Drive).

Some integrations require the integration to successfully communicate back to RSpace, so some firewall configurations might block this. Our team will be able to guide you through the steps needed to get the integration working if this is the case.

RSpace Enterprise customers can customize the interface by adding an organizational branding / company logo image to the top right corner of the interface (replacing the standard RSpace image), and / or by adding up to 2 other custom text links in the page footer (e.g., pointing to a web page you maintain with information about data privacy policies, legal disclaimers, or other important information about using RSpace at your specific organization).

Getting data out of RSpace

RSpace supports export to all standard formats - HTML, XML, PDF, Microsoft Word and JSON (via the RSpace API). Users can export their data themselves, at any level of granularity from a single document to their entire body of work, at any time, and download the export to their own machines. Exports can be scheduled using the API - e.g. running a cron job to invoke an export once a week.
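
A scheduled export of this kind could look roughly like the following. This is a minimal sketch using only the Python standard library. The endpoint path `/api/v1/export/{format}/{scope}` and the `apiKey` header are assumptions that should be checked against the API documentation for your RSpace version; the function names and the server URL are hypothetical.

```python
import urllib.request


def build_export_request(base_url: str, api_key: str,
                         export_format: str = "xml",
                         scope: str = "user") -> urllib.request.Request:
    """Build (but do not send) a POST request that starts an export job.

    Assumptions to verify against your server's API documentation:
    the path /api/v1/export/{format}/{scope} and the 'apiKey' header.
    """
    url = f"{base_url.rstrip('/')}/api/v1/export/{export_format}/{scope}"
    return urllib.request.Request(url, method="POST",
                                  headers={"apiKey": api_key})


def start_export(base_url: str, api_key: str) -> bytes:
    """Send the export request and return the raw response body."""
    request = build_export_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=60) as response:
        return response.read()
```

A crontab entry such as `0 2 * * 0 python3 /path/to/export.py` (a placeholder path) would then run the script every Sunday at 02:00.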

If, as a server administrator, you want to do a low-level data export, this is easily accomplished using standard, free tools. ELN metadata can be exported from the MySQL database (using `mysqldump` or `Percona XtraBackup`), and files can be copied from the internal file store using tools such as `rsync`.
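
As an illustration of the low-level export described above, the sketch below builds `mysqldump` and `rsync` command lines and runs them. It is a sketch under stated assumptions rather than a supported procedure: the database name, user and directory paths are placeholders, and the MySQL password is assumed to come from an option file such as `~/.my.cnf` rather than the command line.

```python
from __future__ import annotations

import subprocess


def mysqldump_command(database: str, user: str, out_file: str) -> list[str]:
    """Build a mysqldump command that writes a logical dump to out_file."""
    return [
        "mysqldump",
        "--single-transaction",  # consistent snapshot of InnoDB tables without locking
        f"--user={user}",
        f"--result-file={out_file}",
        database,
    ]


def rsync_command(source_dir: str, dest_dir: str) -> list[str]:
    """Build an rsync command that mirrors the file store, preserving attributes."""
    # A trailing slash on the source copies the directory's contents.
    return ["rsync", "--archive", source_dir.rstrip("/") + "/", dest_dir]


def run(command: list[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(command, check=True)
```

For example, `run(mysqldump_command("rspace", "backup", "/tmp/rspace.sql"))` followed by `run(rsync_command("/data/filestore", "/backup/filestore"))`, with all names replaced by the values used on your server.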

No data is stored in a binary format proprietary to RSpace.

Sensitive data

Standard on-prem and hosted RSpace deployments are not appropriate for entry of sensitive data (e.g., patient information subject to HIPAA or similar regulatory rules). It is certainly possible, however, to deploy RSpace so that entry of sensitive data is supported and compliant. Often, this issue comes up where usage in a medical school is planned. In these situations, a solution is to deploy RSpace within a validated compliant environment you already use or that you create with assistance from ResearchSpace. Because of the increased cost of data storage and processing in these environments, it may even make sense to deploy a second instance of RSpace specifically for researchers who handle sensitive data, and reserve your standard RSpace deployment for use by the majority of users, who don't need the extra 'compliance wrap'.

In the USA, AWS GovCloud offers a compliant computing environment for organisations bound by federal data-handling regulations. RSpace has been installed successfully in this environment.

Related: our privacy policy.

Migration after a pilot

Customers often run a pilot of RSpace on AWS before deciding to purchase an ongoing license. In that case you can decide whether to continue using the cloud instance as a production instance or switch to an on-premises deployment. If you choose to move to an on-premises deployment, it's possible to migrate data that researchers entered into the cloud instance to the on-premises instance of RSpace.

Data Backup

For on-premise deployments of RSpace, backup is solely the customer’s responsibility. We will consult with your IT personnel at the time of deployment. For backing up AWS-based RSpace instances, ResearchSpace uses scripts to automate the backup process that we are happy to share with customers on request.

When deployed as SaaS (software as a service) onto an Amazon Web Services (AWS) private instance that we manage for you, ResearchSpace and Amazon take care of backups for you.

Data is stored in a MySQL 5.7 or MariaDB 10.3 database; files are stored unmodified on EBS volumes in a directory structure.

  • We make hourly file syncs to S3 using the AWS CLI tool.
  • Nightly and weekly snapshots of instances and data volumes are stored as machine images (AMIs). These are fast to make, and support RTOs in the order of minutes.
  • Logical database backups are made nightly, and stored on S3. Data Files, logs, configuration files and search indices are additionally synced to S3 hourly.
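
To give a sense of what an hourly file sync to S3 involves, here is a minimal sketch built around the AWS CLI `aws s3 sync` command. It is not the script ResearchSpace uses; the bucket name, prefix and local directory are placeholders, and the AWS CLI is assumed to be installed and configured with suitable credentials.

```python
from __future__ import annotations

import subprocess


def s3_sync_command(local_dir: str, bucket: str, prefix: str = "") -> list[str]:
    """Build an `aws s3 sync` command that copies local_dir to an S3 location."""
    destination = f"s3://{bucket}/{prefix.strip('/')}".rstrip("/")
    return ["aws", "s3", "sync", local_dir, destination]


def sync_to_s3(local_dir: str, bucket: str, prefix: str = "") -> None:
    """Run the sync and raise CalledProcessError if the AWS CLI exits non-zero."""
    subprocess.run(s3_sync_command(local_dir, bucket, prefix), check=True)
```

Because `aws s3 sync` only uploads new or changed files, running it hourly from a scheduler keeps each run short once the first full copy is complete.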

For an in-depth description, please read the SaaS Backup document.

In addition, customers can use the export API endpoint to make additional, scheduled, bulk data exports to any destination you like to act as an additional redundant data backup.

