The ultimate enterprise scientific computing platform, with batch scheduling and burst compute at enormous scale, and a lightning-fast package and image builder.
Separate environments: dev, pre, and prod
Run back-tests of your code against your data pipeline
Orchestrate and schedule batch jobs
Scheduling batch jobs at scale
Up to 10K jobs in 10 seconds in parallel and/or with dependencies
Create and schedule batch job DAGs programmatically via the GUI, Python API, or JSON
Deploy batch jobs on Datatailr with 3 lines of code. Just put what you want to run into the __batch_main__ function.
Deploy services as you would deploy an application.
Jobs can be parameterized and created on the fly, as part of another, scheduled job
Customizable display; user settings saved between sessions
Flexible graph-based batch processing and orchestration
Automatically pre-warm machines
Automatically spin VMs up and down for enormous scale
Optimal memory and CPU allocation
Failed jobs retried automatically until success
Rerun only failed parts of batch runs
Full reproducibility to test your code: all batch jobs run the latest version of the image by default
When triggering a re-run of a previously executed batch, you can choose to run it with newer versions of the images
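The dependency ordering and "rerun only failed parts" behaviour described above can be illustrated with a minimal sketch in plain Python. This is not the Datatailr API: `run_dag`, the job names, and the status strings are hypothetical and only show the general idea of walking a job graph in dependency order and skipping jobs that already succeeded in an earlier run.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(jobs, deps, previous=None):
    """Run jobs in dependency order and return a status per job.

    jobs:     dict of job name -> zero-argument callable (hypothetical jobs)
    deps:     dict of job name -> set of job names that must succeed first
    previous: statuses from an earlier run; jobs marked "ok" are not rerun
    """
    previous = previous or {}
    graph = {name: set(deps.get(name, ())) for name in jobs}
    status = {}
    for name in TopologicalSorter(graph).static_order():
        if previous.get(name) == "ok":
            status[name] = "ok"  # succeeded last time, so skip it
            continue
        if any(status.get(d) != "ok" for d in graph[name]):
            status[name] = "skipped"  # an upstream job did not succeed
            continue
        try:
            jobs[name]()
            status[name] = "ok"
        except Exception:
            status[name] = "failed"
    return status


# Usage: the second run repeats only the failed job and what depends on it.
calls = []
flaky = {"fail": True}


def load():
    calls.append("load")


def transform():
    calls.append("transform")
    if flaky["fail"]:
        raise RuntimeError("transient error")


def report():
    calls.append("report")


jobs = {"load": load, "transform": transform, "report": report}
deps = {"transform": {"load"}, "report": {"transform"}}

first = run_dag(jobs, deps)
flaky["fail"] = False
second = run_dag(jobs, deps, previous=first)
```

In this sketch `load` runs once across both runs, while `transform` and `report` are picked up again on the second run.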
Batch jobs planning
Next run time is displayed for each scheduled job
When the expiration date is reached, the batch job transitions to a 'stopped' state
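As a rough illustration of how a next run time and an expiration date interact, here is a small sketch. It assumes a fixed interval between runs and treats a job as stopped once its next run would fall on or after the expiration date. Both are assumptions made for the example; the function `next_run` is hypothetical and the platform's actual scheduling rules are not described here.

```python
from datetime import datetime, timedelta


def next_run(last_run, interval, expires_at):
    """Return ("scheduled", next_time), or ("stopped", None) once expired."""
    candidate = last_run + interval
    if candidate >= expires_at:
        return "stopped", None  # assumption: no runs at or after expiration
    return "scheduled", candidate


interval = timedelta(days=1)
expires_at = datetime(2024, 1, 3)

state_a = next_run(datetime(2024, 1, 1, 9, 0), interval, expires_at)
state_b = next_run(datetime(2024, 1, 2, 9, 0), interval, expires_at)
```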
Batch jobs visibility
Access to stdout and stderr of each job directly from the batch run view
Display each job's custom parameters
Gantt view of execution times
Every process carries entitlements, so costs can be tracked and allocated by user and limits can be set per group
A fast and efficient Python build system to package and deploy applications
Seamless CI/CD Experience
Autobuild can be enabled for any package or image
Set it up once, and any push to the source repo will trigger a build of affected packages and images
Batch jobs use the latest version of an image, so you can always be sure what they're running with
Run automated user-defined tests against new image versions after they're built
Vectorized build system for packages and images
Build isolation using Docker
Caching individual build steps
ML packages are always available on the latest GPU hardware
Available for most key languages
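To show why caching individual build steps saves time, here is a minimal sketch of the general technique. Each step gets a cache key derived from its own text and from every step before it, so a change late in the build leaves the earlier steps cached. This mirrors how layer caches broadly behave, but it is not the Datatailr builder's implementation; `step_keys`, `build`, and the step strings are hypothetical.

```python
import hashlib


def step_keys(steps):
    """Return one cache key per step, covering that step and all earlier ones."""
    keys, parent = [], ""
    for step in steps:
        parent = hashlib.sha256((parent + "\n" + step).encode()).hexdigest()
        keys.append(parent)
    return keys


def build(steps, cache):
    """Run only the steps whose key is not cached; return the steps that ran."""
    ran = []
    for step, key in zip(steps, step_keys(steps)):
        if key not in cache:
            cache.add(key)  # a real builder would store the step's output here
            ran.append(step)
    return ran


cache = set()
v1 = ["install compiler", "install numpy", "copy source"]
v2 = ["install compiler", "install numpy", "copy source v2"]

first_build = build(v1, cache)   # nothing cached yet, so every step runs
second_build = build(v2, cache)  # only the changed last step runs
```

Because each key chains in its predecessors, changing an early step would invalidate every step after it, which is why slow, stable steps are usually placed first.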
Product Demo
The Ultimate SDLC
SDLC - Build Automation
Automatic based on git flow
Different rules can apply to different branches
Build multiple packages from one repo
Automatic update of all images and jobs that depend on a new version of a package
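The automatic update of dependents can be pictured as a walk over a reverse dependency graph. The sketch below is illustrative only: the `affected` function and the package, image, and job names are hypothetical, and it shows just the graph traversal, not how the platform triggers builds.

```python
from collections import deque


def affected(changed, depends_on):
    """Return everything that directly or transitively depends on `changed`."""
    # Invert the "depends on" map into an "is used by" map.
    used_by = {}
    for item, deps in depends_on.items():
        for dep in deps:
            used_by.setdefault(dep, set()).add(item)

    seen, queue = set(), deque([changed])
    while queue:
        for user in used_by.get(queue.popleft(), ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


depends_on = {
    "image:research": {"pkg:pricing", "pkg:utils"},
    "image:reporting": {"pkg:utils"},
    "job:nightly-risk": {"image:research"},
    "job:daily-report": {"image:reporting"},
}

to_update = affected("pkg:pricing", depends_on)
```

Here a new version of `pkg:pricing` reaches `image:research` and the job that runs on it, while the reporting image and job are left alone.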
SDLC - Type Checker
Automatic type annotations out of the box
SDLC - Type Checker Dev Experience
Easily catch type errors
Type Checker Demos
Compare the Datatailr package and image builder with Conda
System Package Integration
Conda: Requires duplication and maintenance of low-level Linux distribution packages
Datatailr package and image builder: Packages integrate well with those provided by the Linux system package manager; allows for greater focus

Dependency Management
Conda: Complex dependency graphs with version specifics. Longer build times and prone to errors (e.g., conda dependency resolution can fail after many hours of “building”)
Datatailr package and image builder: Simplified, based on package names. Builds are typically 3x faster than with conda-build and not prone to error.

Build Isolation
Conda: Relies on environment management
Datatailr package and image builder: Utilizes Docker for complete isolation

Caching Individual Build Steps
Conda: Does not offer this level of granular caching
Datatailr package and image builder: Repeated builds cache individual build steps at the Docker level, which can drastically cut build times in CI/CD pipelines and avoid failed builds.

OS & Distribution Support
Conda: macOS, Windows, and Linux, excluding the Alpine distribution because musl-based C runtimes are not supported
Datatailr package and image builder: Linux only

Focus on Portability
Conda: The ecosystem has ported a large number of system packages, at the expense of developer effort
Datatailr package and image builder: Low; focuses on leveraging existing system resources