Data Science

Data Management for Data Science

Rok enables faster and easier collaboration among users working in data science environments, on-prem or in the cloud.

Versioning

Store immutable versions of your whole environment along with its datasets. Roll back to any point you like, and instantly clone the version you prefer. Start treating the whole environment the way you already treat your code.
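
To make the idea of immutable versions concrete, here is a minimal Python sketch of an append-only snapshot history. It is an illustration only, not Rok's implementation or API: the `Version` and `History` classes, the `commit` and `clone` methods, and the file names are all hypothetical, and a real system would snapshot storage volumes rather than in-memory dictionaries.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Dict, List, Mapping, Optional


@dataclass(frozen=True)
class Version:
    """One immutable snapshot of an environment (hypothetical structure)."""
    number: int
    parent: Optional[int]
    files: Mapping[str, bytes]  # read-only view of path -> contents


class History:
    """Append-only list of snapshots; stored versions are never modified."""

    def __init__(self) -> None:
        self._versions: List[Version] = []

    def commit(self, files: Dict[str, bytes], parent: Optional[int] = None) -> int:
        number = len(self._versions)
        # Copy the input so later changes to `files` cannot alter the snapshot.
        frozen = MappingProxyType(dict(files))
        self._versions.append(Version(number, parent, frozen))
        return number

    def clone(self, number: int) -> Dict[str, bytes]:
        """Return a fresh, editable working copy of a stored version."""
        return dict(self._versions[number].files)


history = History()
v0 = history.commit({"train.py": b"epochs = 1", "data.csv": b"a,b\n1,2\n"})

# Edit a clone and commit it as a new version; v0 is left untouched.
work = history.clone(v0)
work["train.py"] = b"epochs = 10"
v1 = history.commit(work, parent=v0)

# Rolling back is just cloning an earlier version; v1 stays in the history.
rolled_back = history.clone(v0)
```

In this sketch a rollback never deletes anything: the earlier version is cloned into a new working copy, and every committed version remains available.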

Packaging

Package everything together and add user-provided or automatically-generated metadata to your packages, so you or a colleague can deploy your whole machine learning environment to any other platform, running anywhere in the world. Instantly.

Reproducibility

Keep track of your versions, their history and associations, and recreate your complete environment exactly the way it was at any point in time, without searching for missing outputs and lost temporary datasets. Enable end-to-end auditable processes for your work.

Deduplication

Store only what has changed. Rok detects the parts that have remained unchanged between versions and only stores them once. This way, you make the most efficient use of your underlying storage capacity.
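
One common way to store unchanged parts only once is content-addressed chunking: split each version into chunks, key each chunk by its hash, and keep a per-version list of hashes. The Python sketch below illustrates that general technique under stated assumptions. It is not a description of how Rok works internally; the `ChunkStore` class, its methods, and the tiny fixed chunk size are hypothetical choices made for readability.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # unrealistically small, so the example is easy to follow


class ChunkStore:
    """Stores each distinct chunk once, keyed by its SHA-256 digest."""

    def __init__(self) -> None:
        self.chunks: Dict[str, bytes] = {}

    def put_version(self, data: bytes) -> List[str]:
        """Store one version and return its manifest (ordered chunk digests)."""
        manifest = []
        for start in range(0, len(data), CHUNK_SIZE):
            chunk = data[start:start + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # A chunk already seen in an earlier version is not stored again.
            self.chunks.setdefault(digest, chunk)
            manifest.append(digest)
        return manifest

    def get_version(self, manifest: List[str]) -> bytes:
        """Rebuild a version from its manifest."""
        return b"".join(self.chunks[digest] for digest in manifest)

    def stored_bytes(self) -> int:
        return sum(len(chunk) for chunk in self.chunks.values())


store = ChunkStore()
v1 = store.put_version(b"AAAABBBBCCCC")
v2 = store.put_version(b"AAAAXXXXCCCC")  # only the middle chunk differs

logical_bytes = len(store.get_version(v1)) + len(store.get_version(v2))
physical_bytes = store.stored_bytes()
```

Here the two versions add up to 24 logical bytes, but the store holds only four distinct chunks (16 bytes), because the first and last chunks are shared between the versions.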

Collaboration

Make your whole environment reproducible and available to others working on different infrastructure anywhere in the world. A whole new way for teams to collaborate and iterate, faster and easier than ever.

Security

Your data and data science environments are shared over encrypted, point-to-point connections, which you can opt to run over your private network. Rok allows for encrypting all data at rest. Your data never crosses or gets stored at a central point.

Integration

Rok runs everywhere, so you can continue using your laptop, any public cloud, or your existing on-prem virtualization or container platform. You can now share across VMware, Kubernetes on-prem, EKS on AWS, or GKE on GCP.

Execution

Run fast! Run your data science environment on any kind of primary storage you like. Rok now makes it viable to run over super-fast, cost-efficient, ephemeral, local NVMe or traditional SSD storage. Rok sits on the side of your primary storage, not in the critical I/O path.

Distribution

Sync faster! Sync your environments with your peers without pushing data to, and pulling it from, a central repository. Feel the power and efficiency of the distributed, peer-to-peer Rok Network. The future is decentralized.

Use Rok and Rok Registry to share whole data science environments (code + libs + data) with hundreds of collaborators, across any location, on-prem or in the cloud.