The ultimate enterprise scientific computing platform, with batch scheduling and burst compute at enormous scale, and a lightning-fast package and image builder.
Segregate environments: dev/pre/prod
Run back-tests of your code against your data pipeline
Orchestrate and schedule batch jobs
Scheduling batch jobs at scale
Up to 10K jobs in 10 seconds in parallel and/or with dependencies
Create and schedule batch job DAGs programmatically via the GUI, Python API, or JSON
Deploy batch jobs on Datatailr with 3 lines of code. Just put what you want to run into the __batch_main__ function.
Deploy services as you would deploy an application.
Jobs can be parameterized and created on the fly, as part of another, scheduled job
Customizable display; user settings saved between sessions
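To make the idea of a batch job DAG with dependencies concrete, here is a minimal sketch using only the Python standard library. The JSON layout, the job names, and the helper execution_waves are hypothetical and are not the Datatailr API or schema; the sketch only shows how declared dependencies determine which jobs could run in parallel.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical JSON description of a batch: each job maps to the jobs it depends on.
# This layout is an assumption for illustration, not Datatailr's actual schema.
DAG_JSON = """
{
  "load_prices": [],
  "load_trades": [],
  "compute_risk": ["load_prices", "load_trades"],
  "publish_report": ["compute_risk"]
}
"""

def execution_waves(dag: dict[str, list[str]]) -> list[list[str]]:
    """Group jobs into waves; all jobs in one wave have their dependencies met."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # jobs that could run in parallel now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished so dependents become ready
    return waves

waves = execution_waves(json.loads(DAG_JSON))
for number, wave in enumerate(waves, start=1):
    print(f"wave {number}: {', '.join(wave)}")
```

Jobs in the same wave have no dependencies on each other, which is what allows a scheduler to run them in parallel.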
  • Flexible graph-based batch processing and orchestration
  • Automatically pre-warm machines
  • Automatically spin VMs up and down for enormous scale
  • Optimal memory and CPU allocation
  • Failed jobs retried automatically until success
  • Rerun only failed parts of batch runs
  • Full reproducibility to test your code: all batch jobs run the latest version of the image by default
  • When triggering a re-run of a previously executed batch, you can choose to run it with newer versions of the images
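The retry and partial re-run behaviour listed above can be sketched in a few lines of plain Python. This is an illustrative model, not Datatailr's implementation: the functions run_with_retries and run_batch, the status strings, and the cap on attempts (max_attempts) are all assumptions made for the example.

```python
from typing import Callable, Dict, Optional

def run_with_retries(job: Callable[[], None], max_attempts: int) -> bool:
    """Call job until it succeeds or max_attempts is exhausted (cap is an assumption)."""
    for _ in range(max_attempts):
        try:
            job()
            return True
        except Exception:
            continue  # a real scheduler would log the failure and back off
    return False

def run_batch(
    jobs: Dict[str, Callable[[], None]],
    previous: Optional[Dict[str, str]] = None,
    max_attempts: int = 3,
) -> Dict[str, str]:
    """Run a batch, skipping jobs that already succeeded in a previous run."""
    status = dict(previous or {})
    for name, job in jobs.items():
        if status.get(name) == "success":
            continue  # rerun only the failed parts
        status[name] = "success" if run_with_retries(job, max_attempts) else "failed"
    return status

# Hypothetical jobs: extract always works, transform fails on its first four calls.
calls = {"extract": 0, "transform": 0}

def extract() -> None:
    calls["extract"] += 1

def transform() -> None:
    calls["transform"] += 1
    if calls["transform"] < 5:
        raise RuntimeError("transient failure")

jobs = {"extract": extract, "transform": transform}
first_run = run_batch(jobs, max_attempts=3)
second_run = run_batch(jobs, previous=first_run, max_attempts=3)
```

In this sketch the second run never calls extract again, because its earlier success is carried over in the status mapping.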
Batch jobs planning
Next run time is displayed for each scheduled job
When the expiration date is reached, the batch job transitions to a 'stopped' state
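A small sketch of how a next run time and an expiration date might interact, assuming a fixed-interval schedule. The Schedule class, its fields, and the 'scheduled' state name are hypothetical; only the 'stopped' state comes from the description above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class Schedule:
    first_run: datetime
    interval: timedelta
    expires: datetime

    def state(self, now: datetime) -> str:
        """'stopped' once the expiration date is reached, otherwise 'scheduled'."""
        return "stopped" if now >= self.expires else "scheduled"

    def next_run(self, now: datetime) -> Optional[datetime]:
        """First run time strictly after now, or None if no run remains before expiry."""
        if self.state(now) == "stopped":
            return None
        if now < self.first_run:
            return self.first_run
        elapsed = (now - self.first_run) // self.interval  # whole intervals so far
        candidate = self.first_run + (elapsed + 1) * self.interval
        return candidate if candidate < self.expires else None

# Hypothetical nightly job: runs at 02:00 each day and expires on 10 January.
nightly = Schedule(
    first_run=datetime(2024, 1, 1, 2, 0),
    interval=timedelta(days=1),
    expires=datetime(2024, 1, 10),
)
upcoming = nightly.next_run(datetime(2024, 1, 3, 12, 0))
final_state = nightly.state(datetime(2024, 1, 10))
```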
Batch jobs visibility
  • Access to stdout and stderr of each job directly from the batch run view
  • Display each job's custom parameters
Gantt view of execution times
Every process carries entitlements, so costs can be tracked and allocated by user and limits can be set per group
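As an illustration of allocating costs by user and checking group limits, the sketch below aggregates a handful of made-up usage records. The record layout, user and group names, amounts, and limits are all invented for the example and do not reflect how Datatailr stores or reports costs.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical usage records: (user, group, cost). All values are invented.
usage: List[Tuple[str, str, float]] = [
    ("alice", "quant", 12.5),
    ("bob", "quant", 30.0),
    ("carol", "research", 7.25),
    ("alice", "quant", 2.5),
]
group_limits: Dict[str, float] = {"quant": 40.0, "research": 50.0}

def total_cost(records: List[Tuple[str, str, float]], key_index: int) -> Dict[str, float]:
    """Sum cost per user (key_index=0) or per group (key_index=1)."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        totals[record[key_index]] += record[2]
    return dict(totals)

by_user = total_cost(usage, 0)
by_group = total_cost(usage, 1)
# Groups whose accumulated cost exceeds their configured limit.
over_limit = sorted(g for g, spent in by_group.items() if spent > group_limits[g])
```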

A fast and efficient Python build system to package and deploy applications
Seamless CI/CD Experience
  • Autobuild can be enabled for any package or image
  • Set it up once, and any push to the source repo will trigger a build of affected packages and images
  • Batch jobs use the latest version of an image, so you can always be sure what they're running with
  • Run automated user-defined tests against new image versions after they're built
Vectorized build system for packages and images
  • Build isolation using Docker
  • Caching individual build steps
  • ML packages are always available on the latest GPU hardware
  • Available for most key languages
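The benefit of caching individual build steps can be shown with a simplified model in the spirit of Docker layer caching. It is not Docker's actual algorithm nor Datatailr's builder: the functions step_key and build_image and the step strings are hypothetical. Each step's cache key depends on the step itself and on every step before it, so changing one step invalidates only that step and the ones after it.

```python
import hashlib
from typing import Dict, List

def step_key(parent_key: str, instruction: str) -> str:
    """Cache key for a step: hash of the previous step's key plus this instruction."""
    return hashlib.sha256(f"{parent_key}\n{instruction}".encode()).hexdigest()

def build_image(steps: List[str], cache: Dict[str, str]) -> List[str]:
    """Run the steps in order, reusing cached results; return the steps actually executed."""
    executed = []
    key = ""
    for instruction in steps:
        key = step_key(key, instruction)
        if key not in cache:
            cache[key] = f"result of: {instruction}"  # stand-in for really running the step
            executed.append(instruction)
    return executed

cache: Dict[str, str] = {}
first = build_image(["base image", "install numpy", "copy source rev 1"], cache)
# Only the last step changed, so the first two are served from the cache.
second = build_image(["base image", "install numpy", "copy source rev 2"], cache)
# Changing a middle step invalidates it and everything after it.
third = build_image(["base image", "install numpy and pandas", "copy source rev 2"], cache)
```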
The Ultimate SDLC
SDLC - Build Automation
  • Automatic, based on git flow
  • Different rules can apply to different branches
  • Build multiple packages from one repo
  • Automatic update of all images and jobs that depend on a new version of a package
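Updating everything that depends on a new package version amounts to walking a dependency graph in reverse. The sketch below assumes a simple mapping from each artifact to the artifacts it depends on; the artifact names and the function dependents_to_rebuild are hypothetical and only illustrate the traversal.

```python
from collections import deque
from typing import Dict, List, Set

def dependents_to_rebuild(depends_on: Dict[str, List[str]], changed: str) -> List[str]:
    """Return every artifact that directly or transitively depends on changed."""
    # Invert the graph: for each artifact, which artifacts depend on it?
    reverse: Dict[str, Set[str]] = {}
    for artifact, dependencies in depends_on.items():
        for dependency in dependencies:
            reverse.setdefault(dependency, set()).add(artifact)

    affected: Set[str] = set()
    queue = deque([changed])
    while queue:  # breadth-first walk over the reversed edges
        current = queue.popleft()
        for dependent in reverse.get(current, ()):
            if dependent not in affected:
                affected.add(dependent)
                queue.append(dependent)
    return sorted(affected)

# Hypothetical packages, images, and jobs.
depends_on = {
    "pkg-analytics": ["pkg-core"],
    "image-research": ["pkg-analytics"],
    "image-web": ["pkg-ui"],
    "job-nightly": ["image-research"],
}
affected = dependents_to_rebuild(depends_on, "pkg-core")
```

Here image-web is left alone because nothing on its dependency path changed.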
SDLC - Type Checker
Automatic type annotations out of the box
SDLC - Type Checker Dev Experience
Easily catch type errors
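For a sense of the kind of error a static type checker catches, consider the small annotated function below. The source does not say which checker Datatailr uses or how its automatic annotations are produced, so this sketch assumes ordinary hand-written Python type hints; the function mean_price is hypothetical.

```python
def mean_price(prices: list[float]) -> float:
    """Average of a non-empty list of prices."""
    return sum(prices) / len(prices)

average = mean_price([10.0, 12.0])

# A static type checker would flag the call below before the code ever runs,
# because a str is passed where list[float] is expected. Without a checker,
# the mistake only surfaces at runtime as a TypeError.
try:
    mean_price("10.0, 12.0")  # type: ignore[arg-type]
    failed_at_runtime = False
except TypeError:
    failed_at_runtime = True
```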
Compare the Datatailr package and image builder with Conda
System Package Integration
  • Conda: Requires duplication and maintenance of Linux distribution low-level packages
  • Datatailr package and image builder: Packages integrate well with those provided by the Linux system package manager; allows for greater focus
Dependency Management
  • Conda: Complex dependency graphs with version specifics. Longer build times and prone to errors (e.g., conda dependency resolution can fail after many hours of “building”)
  • Datatailr package and image builder: Simplified, based on package names. Builds are typically 3x faster than with Conda-build and not prone to error.
Build Isolation
  • Conda: Relies on environment management
  • Datatailr package and image builder: Utilizes Docker for complete isolation
Caching Individual Build Steps
  • Conda: Does not offer this level of granular caching
  • Datatailr package and image builder: Repeated builds cache individual build steps at the Docker level, which can drastically cut build times in CI/CD pipelines and avoid failed builds.
OS & Distribution Support
  • Conda: macOS, Windows and Linux, excluding the Alpine distribution due to lack of support for musl C runtime distributions
  • Datatailr package and image builder: Linux only
Focus on Portability
  • Conda: The ecosystem has ported a large number of system packages, but at the expense of developer effort
  • Datatailr package and image builder: Low; focuses on leveraging existing system resources