As Amazon Web Services (AWS) continues releasing a multitude of products and resources, finding the right ones for your business can become a whole chore in and of itself. When it comes to machine learning (ML), there are now two options that might seem similar on the surface but are certainly not identical. Amazon SageMaker and Amazon ML both provide complete packages with various tools to create and deploy ML models while taking unique approaches to doing so.
The primary difference between the two lies in their target user bases. While Amazon ML’s high level of automation makes predictive analytics with ML accessible even for the layman, Amazon SageMaker’s openness to customized usage makes it a better fit for experienced data scientists. However, even high-level professionals can appreciate some simplified automation to make their lives easier, so it may just depend on what the job is and how much time is available to get it done.
Let’s take a look at the specific characteristics of each to get an idea of which one is the best fit for you.
The Fundamentals of Amazon Machine Learning
ML technology as a whole gives developers valuable tools for improving what software can do by learning from data rather than through explicit programming. But this technology comes with a significant learning curve due to complex statistical techniques and algorithms. The traditional ML development process lacks integrated tools, making it necessary to pull from various products and workflows that aren’t made to work together. This eats into resources like time and money, and it can hurt the quality of the final results.
Using Amazon ML integrates the various parts of the complex ML process and all but erases the learning curve—or at least flattens it out a bit. It’s a cloud-based service that uses helpful wizards and visualization tools to guide developers of all levels through the ML model creation process. It then takes the models and obtains predictions using simple application programming interfaces (APIs). Throughout this process, you never need to manage complicated infrastructure or write custom code to generate the predictions; Amazon ML does it all for you.
When given data, Amazon ML automatically chooses which methods to use. It can load data from multiple sources, such as CSV files, Amazon Redshift, and Amazon RDS, and it identifies the numerical and categorical fields in that data. It then applies preprocessing such as dimensionality reduction and whitening, which allows it to create ML models that generate predictions based on patterns extracted from the data. Predictions are easy to obtain using simple APIs without the need to implement custom prediction code or manage infrastructure.
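To make the idea of identifying numerical and categorical fields more concrete, here is a minimal sketch in plain Python. It is not Amazon ML's actual logic, which isn't described here; the function name `infer_field_types`, the sample columns, and the "parses as a float" rule are all assumptions chosen for illustration.

```python
import csv
import io


def infer_field_types(csv_text):
    """Label each column 'numeric' if every non-empty value parses as a
    float, otherwise 'categorical'. (Hypothetical helper, not an AWS API.)"""
    reader = csv.DictReader(io.StringIO(csv_text))
    types = {}
    for row in reader:
        for name, value in row.items():
            if not value:
                continue  # treat empty cells as missing and skip them
            try:
                float(value)
                is_numeric = True
            except ValueError:
                is_numeric = False
            # A column stays numeric only while every value seen so far is numeric.
            still_numeric = types.get(name, "numeric") == "numeric"
            types[name] = "numeric" if (is_numeric and still_numeric) else "categorical"
    return types


# Made-up sample data with one missing value in the last row.
sample = "age,plan,monthly_spend\n34,basic,19.99\n51,premium,49.50\n29,basic,\n"
print(infer_field_types(sample))
# {'age': 'numeric', 'plan': 'categorical', 'monthly_spend': 'numeric'}
```

A real service would likely use richer rules (for example, treating low-cardinality integers as categories), so take this only as a picture of the kind of decision being automated.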
With this automated process, you can easily go from models to prediction in a matter of seconds without taking the time to learn the details of ML methods. Despite the limited options for personalization, you’ll end up with high performance and scalable results.
How Amazon SageMaker Measures Up
Amazon SageMaker provides the same type of service for building and deploying ML models, but in a more progressive environment for advanced users. Rather than just providing a dataset for automated model building, the developer can pull from various tools within Amazon SageMaker to create their own processes.
This creation starts with spinning up a “notebook instance” to host the Jupyter Notebook application that holds all files, notebooks, and auxiliary scripts. Jupyter simplifies data analysis and exploration without the need to deal with the nuisance of server management, and many developers are already familiar with it from previous work in ML. Within the application, you create a new notebook to collect and prepare data, define your model, and begin the ML process.
Everything happens in one place using popular tools like Python as well as libraries available within Amazon SageMaker. SageMaker also supports frameworks such as Apache MXNet and TensorFlow out of the box, along with 10 built-in algorithms such as XGBoost, PCA, and K-Means, to name just a few. According to Amazon, these algorithms are optimized for its platform to deliver higher performance than they would running elsewhere.
However, you don’t need to limit yourself to the tools available in Amazon SageMaker. You can bring in your own favorite frameworks or work in a different programming language if you prefer that customization. Doing so won’t be as straightforward, but the option exists for those with particular methods and goals in mind. It just involves packaging any ML algorithm into a Docker container, followed by plugging it into the training-service pipeline.
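As a rough picture of what goes inside such a container, the sketch below shows a toy training entry point. It assumes SageMaker's documented container layout, in which hyperparameters arrive in `/opt/ml/input/config/hyperparameters.json` (with values as strings), each input channel is mounted under `/opt/ml/input/data/<channel>`, and model artifacts are written to `/opt/ml/model`. The channel name `training`, the `scale` hyperparameter, and the "model" itself (a scaled mean) are made up for illustration.

```python
import json
import os
import tempfile


def train(base_dir="/opt/ml"):
    """Toy training routine following the assumed SageMaker directory layout."""
    hp_path = os.path.join(base_dir, "input", "config", "hyperparameters.json")
    data_dir = os.path.join(base_dir, "input", "data", "training")  # assumed channel name
    model_dir = os.path.join(base_dir, "model")

    with open(hp_path) as f:
        hyperparameters = json.load(f)  # values are passed as strings

    # Read one number per line from every file in the channel directory.
    values = []
    for filename in sorted(os.listdir(data_dir)):
        with open(os.path.join(data_dir, filename)) as f:
            values.extend(float(line) for line in f if line.strip())

    # Stand-in for a real algorithm: a scaled mean of the inputs.
    scale = float(hyperparameters.get("scale", "1.0"))
    model = {"mean": scale * sum(values) / len(values)}

    # Whatever is saved here is what the service would package as the model.
    os.makedirs(model_dir, exist_ok=True)
    model_path = os.path.join(model_dir, "model.json")
    with open(model_path, "w") as f:
        json.dump(model, f)
    return model_path


# Local dry run against a temporary directory instead of /opt/ml.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "input", "config"))
    os.makedirs(os.path.join(tmp, "input", "data", "training"))
    with open(os.path.join(tmp, "input", "config", "hyperparameters.json"), "w") as f:
        json.dump({"scale": "2.0"}, f)
    with open(os.path.join(tmp, "input", "data", "training", "data.csv"), "w") as f:
        f.write("1\n2\n3\n")
    with open(train(tmp)) as f:
        print(json.load(f))  # {'mean': 4.0}
```

Making the base directory a parameter is a design choice for this sketch: it lets you exercise the script on a laptop before building the Docker image around it.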
After launching training with one click, Amazon SageMaker will spin up one or more “training instances” and upload all necessary data and scripts to run the training. The model is then ready for deployment to a production-ready cluster, and you can serve predictions through an HTTP API. Deployment in Amazon SageMaker includes fully managed hosting with automatic scaling, and the service also offers automatic tuning of models for accuracy. Additionally, it includes built-in capabilities for A/B testing to experiment with different versions of models and find the best results.
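The idea behind A/B testing model versions is that each incoming request is routed to one of several variants in proportion to a weight. The following is a small local simulation of that idea, not SageMaker's implementation; the variant names, the 90/10 split, and the `pick_variant` helper are hypothetical.

```python
import random
from collections import Counter


def pick_variant(variant_weights, rng):
    """Choose one variant name with probability proportional to its weight."""
    names = list(variant_weights)
    weights = [variant_weights[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


# Hypothetical setup: send about 90% of traffic to the current model
# and about 10% to a challenger.
variant_weights = {"model-a": 0.9, "model-b": 0.1}

rng = random.Random(0)  # fixed seed so the simulation is repeatable
counts = Counter(pick_variant(variant_weights, rng) for _ in range(10_000))
print(counts)
```

Comparing the prediction quality recorded for each variant then tells you whether the challenger deserves a larger share of traffic.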
Learn more in this Amazon SageMaker tutorial.
Putting the Two Head-to-Head
When deciding between the two services, it’s best to consider the skill of the user, the timeline of the project, and the desire for customization. A skilled data scientist is not only likely to be able to manage Amazon SageMaker, but may prefer it due to its more advanced capabilities and versatile process. Amazon SageMaker was designed for seamless adoption by professionals in the ML community, providing tools and features that data scientists can adapt to easily. And the powerful platform can do more than just get the job done: its wide variety of development platforms can also allow an engineer to reinvent themselves in the world of big data.
Amazon ML, on the other hand, provides a simple process that is great for deadline-sensitive projects and developers who could use some ML guidance. It’s a more basic experience where a lot has been done for you. For example, it is limited to three types of prediction: regression, binary classification, and multiclass classification. Based on the type of prediction, it will help you out by automatically picking an algorithm for generating predictions. This three-way limit and automatic algorithm selection may feel like a weight lifted off the shoulders for some developers, but may feel restricting to others. For the latter group, Amazon SageMaker allows selection from 10 pre-loaded algorithms or creation of your own, granting much more freedom.
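The automatic selection can be pictured as a simple lookup from the kind of target you are predicting to a model type. The sketch below reflects the mapping as described in Amazon ML's documentation to the best of our understanding (numeric targets use linear regression, binary targets use logistic regression, and categorical targets use multinomial logistic regression); the table and function names are hypothetical and not part of any AWS SDK.

```python
# Target attribute type -> (model type, learning algorithm).
# Illustrative lookup table, not an AWS data structure.
MODEL_BY_TARGET_TYPE = {
    "NUMERIC": ("REGRESSION", "linear regression"),
    "BINARY": ("BINARY", "logistic regression"),
    "CATEGORICAL": ("MULTICLASS", "multinomial logistic regression"),
}


def choose_model(target_type):
    """Return the (model type, algorithm) pair for a target attribute type."""
    try:
        return MODEL_BY_TARGET_TYPE[target_type.upper()]
    except KeyError:
        raise ValueError(f"unsupported target type: {target_type!r}") from None


for target in ("numeric", "binary", "categorical"):
    model_type, algorithm = choose_model(target)
    print(f"{target:12} -> {model_type:10} ({algorithm})")
```

Anything outside these three cases, such as clustering with no target at all, raises an error here, which mirrors the restriction described above.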
Amazon ML also does not support unsupervised learning methods, so the developer must select and label the target variable in any given training set. Compare this to Amazon SageMaker, which offers a slew of training options, including algorithms provided by SageMaker, custom code, custom algorithms, and subscription algorithms from the AWS Marketplace. In general, users of Amazon SageMaker can play a lot more with their personally preferred methods and datasets by leveraging its deployment features or integrating it with other machine learning libraries.
Aside from ease-of-use and versatility, a couple of practical factors to take into consideration are availability and price. Because Amazon SageMaker is newer than Amazon ML, it is not available everywhere. You can check if it’s available in your region online in the same place that you can find pricing. The region may affect prices, so don’t assume that a price you saw in one country will hold in a different one. But regarding a comparison between Amazon SageMaker and Amazon ML, the difference in price will likely come down to how much work you need to do and how much time it will take.
With this in mind, you should first and foremost consider the people who will use these tools as well as the projects that they need to complete. Both Amazon ML and Amazon SageMaker are solid options for companies already using cloud services from Amazon, but have distinct approaches catering to different preferences.