AWS re:Invent Recap: Amazon SageMaker Clarify


What happened?

AWS released Amazon SageMaker Clarify, a new tool that helps customers detect and mitigate bias in machine learning models more accurately and rapidly so they can build better solutions. It provides critical data and insights that increase transparency, helping teams analyze and explain model behavior to stakeholders and customers.

Why is it important?

  • Easily Detect Bias: SageMaker Clarify helps data scientists detect bias in datasets before training and in their models after training (see the sketch after this list).
  • Valuable Metrics & Statistics: It explains how feature values contribute to the predicted outcome, both for the model overall and for individual predictions.
  • Build Better Solutions: Developers can specify sensitive model attributes, such as location, occupation, or age, and focus Clarify’s bias-detection algorithms on those attributes. This enables teams to build more accurate and effective solutions that drive client success.
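
Below is a minimal sketch of what a pre-training bias check could look like with the SageMaker Python SDK’s clarify module; the IAM role, S3 paths, column names, and facet threshold are hypothetical placeholders.

```python
# Minimal sketch of a SageMaker Clarify pre-training bias check.
# Role, bucket, columns, and facet values are hypothetical placeholders.
import sagemaker
from sagemaker import clarify

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/HypotheticalSageMakerRole"  # placeholder

processor = clarify.SageMakerClarifyProcessor(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    sagemaker_session=session,
)

data_config = clarify.DataConfig(
    s3_data_input_path="s3://hypothetical-bucket/train/train.csv",   # placeholder
    s3_output_path="s3://hypothetical-bucket/clarify-bias-report/",  # placeholder
    label="approved",                                   # hypothetical target column
    headers=["age", "occupation", "income", "approved"],  # hypothetical columns
    dataset_type="text/csv",
)

bias_config = clarify.BiasConfig(
    label_values_or_threshold=[1],    # value of the positive outcome
    facet_name="age",                 # attribute to check for bias
    facet_values_or_threshold=[40],   # hypothetical threshold splitting the facet
)

# Computes pre-training bias metrics (e.g., class imbalance) and writes a report to S3.
processor.run_pre_training_bias(
    data_config=data_config,
    data_bias_config=bias_config,
    methods="all",
)
```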

Why are we excited?

With Amazon SageMaker Clarify, we can now better understand each feature in our ML models and give more detailed explanations to stakeholders. It provides transparency into model behavior, giving leadership more valuable information to inform critical business decisions. SageMaker Clarify also includes feature importance graphs that explain model predictions, and it produces reports for presentations that highlight any significant business impacts.
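
For the feature importance piece, here is a compact sketch of how an explainability (SHAP) job could be configured with the same clarify module; the model name, baseline row, and S3 paths are hypothetical.

```python
# Minimal sketch of a SageMaker Clarify explainability (SHAP) job.
# Model name, baseline row, and S3 paths are hypothetical placeholders.
import sagemaker
from sagemaker import clarify

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/HypotheticalSageMakerRole"  # placeholder

processor = clarify.SageMakerClarifyProcessor(
    role=role, instance_count=1, instance_type="ml.m5.xlarge", sagemaker_session=session
)

data_config = clarify.DataConfig(
    s3_data_input_path="s3://hypothetical-bucket/train/train.csv",
    s3_output_path="s3://hypothetical-bucket/clarify-explainability/",
    label="approved",
    headers=["age", "occupation", "income", "approved"],
    dataset_type="text/csv",
)

model_config = clarify.ModelConfig(
    model_name="hypothetical-model",  # a model already registered in SageMaker
    instance_type="ml.m5.xlarge",
    instance_count=1,
    accept_type="text/csv",
)

shap_config = clarify.SHAPConfig(
    baseline=[[35, 1, 50000]],  # hypothetical baseline row (age, occupation code, income)
    num_samples=100,
    agg_method="mean_abs",
)

# Produces per-feature SHAP attributions and an explainability report in S3,
# which back the feature importance graphs shown in SageMaker Studio.
processor.run_explainability(
    data_config=data_config,
    model_config=model_config,
    explainability_config=shap_config,
)
```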

Availability

SageMaker Clarify is available in all regions where Amazon SageMaker is available, and the tool is free for all current users of Amazon SageMaker.

If you’re looking to explore these services further and need some guidance, let us know and we’ll connect you to an Idexcel expert!

AWS 2020 re:Invent Recap: AWS Trainium

What Happened: 

The newest AWS custom-designed chip, AWS Trainium, was announced during Andy Jassy’s 2020 re:Invent keynote and is projected to offer the best price performance for training machine learning (ML) models in the cloud. It is meant for deep learning training workloads for applications such as image classification, semantic search, translation, voice recognition, natural language processing (NLP), and recommendation engines.

Why It’s Important: 

  • Lower Costs: AWS Trainium instances are specifically targeted at reducing training costs, complementing the existing savings from AWS Inferentia, which focuses on the inference side of ML applications.
  • Easy Integration: Since it shares the same AWS Neuron SDK as AWS Inferentia, developers with Inferentia experience can start working with AWS Trainium quickly. The SDK integrates with popular ML frameworks like PyTorch, MXNet, and TensorFlow, making it easier for developers to migrate workloads from GPU instances with minimal code changes (see the sketch after this list).
  • Greater Capabilities: This chip is optimized for training deep learning models for applications using images, text, and audio, which means more opportunities to build solutions that solve operational business challenges across industries.
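
As a rough illustration of the “minimal code changes” point above, here is a sketch of a PyTorch training loop targeting a Neuron/XLA device in the style the Neuron SDK builds on; the model, data, and training loop are hypothetical stand-ins, and the exact API may differ once Trainium-backed instances launch.

```python
# Hypothetical sketch: a PyTorch training loop moved onto a Neuron (Trainium)
# device via the XLA-style device API the Neuron SDK builds on.
import torch
import torch.nn as nn
import torch_xla.core.xla_model as xm  # provided with the Neuron/XLA tooling

device = xm.xla_device()  # selects the Neuron device instead of a GPU

model = nn.Linear(128, 10).to(device)          # hypothetical tiny model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

for step in range(10):
    # Hypothetical random training data standing in for a real dataset.
    inputs = torch.randn(32, 128).to(device)
    targets = torch.randint(0, 10, (32,)).to(device)

    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()
    xm.mark_step()  # flushes the lazily recorded graph for execution on the device
```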

Why We’re Excited

AWS Trainium promises to be a highly advanced training technology for delivering solutions that address customer challenges and project requirements. Because it integrates with and complements AWS Inferentia, ML training capabilities stand to increase significantly in speed and efficiency. As a cost-effective option with a broad set of capabilities and a robust AWS toolset to support it, it allows end-to-end workflows to be created that scale AI/ML training workloads faster and bring products or services to market at an accelerated rate.

Availability:

AWS Trainium will be available via Amazon EC2 instances and in Amazon SageMaker in the second half of 2021.

If you’re looking to explore these services further and need some guidance, let us know and we’ll connect you to an Idexcel expert!

AWS re:Invent Recap: SageMaker Data Wrangler

What happened?

The new service, SageMaker Data Wrangler, was announced during Andy Jassy’s 2020 re:Invent keynote. Incorporated into Amazon SageMaker, this tool simplifies the data preparation workflow so the entire process can be done from one central interface.

Why is it important?

  • SageMaker Data Wrangler contains over 300 built-in data transformations to normalize, transform, and combine features without having to write any code.
  • With SageMaker Data Wrangler’s visualization templates, transformations can be previewed and inspected in Amazon SageMaker Studio.
  • Data can be collected from multiple data sources and imported in a single step for data transformations.
  • Data can come in various formats, such as CSV files, Parquet files, and database tables.
  • The data preparation workflow can be exported to a notebook or code script for use in an Amazon SageMaker pipeline or for future reuse.

Why We’re Excited

SageMaker Data Wrangler makes it easier for data scientists to prepare data for machine learning training using pre-built data preparation options. With preparation completed more quickly, our data science teams can accelerate the delivery of solutions to clients.

If you’re looking to explore these services further and need some guidance, let us know and we’ll connect you to an Idexcel expert!

Amazon SageMaker in Machine Learning

Machine Learning (ML) has become the talk of the town, and its usage has become ingrained in virtually all spheres of the technology sector. As more applications begin to employ ML in their functioning, there is tremendous potential value for businesses. However, developers still have to overcome many obstacles to harness the power of ML in their organizations.

Keeping the difficulty of deployment in mind, many developers are turning to Amazon Web Services (AWS). Some of the challenges include correctly collecting, cleaning, and formatting the available data. Once the dataset is available, it needs to be prepared, which is one of the most significant roadblocks. After processing, there are many other steps that need to be followed before the data can be utilized.

Why should developers use AWS SageMaker?
Developers need to visualize, transform, and prepare their data before drawing insights from it. Even simple models need significant compute power and time to train. From choosing the appropriate algorithm to tuning the parameters to measuring the accuracy of the model, everything requires plenty of resources and time in the long run.

With AWS SageMaker, data scientists can easily build, train, and deploy machine learning models without needing extensive deployment expertise. As an end-to-end machine learning service, Amazon SageMaker enables users to accelerate their machine learning efforts and set up production applications efficiently.

Bid farewell to heavy lifting and guesswork when it comes to using machine learning techniques. Amazon SageMaker provides easy-to-use, pre-built development notebooks and scales popular machine learning algorithms to handle petabyte-scale datasets. SageMaker further simplifies the training process, which translates into shorter model tuning times. In the words of AWS experts, the idea behind SageMaker was to remove complexity while allowing developers to use machine learning more extensively and efficiently.

Visualize and Explore Stored Data
As a fully managed environment, SageMaker makes it easy for developers to visualize and explore stored data. The data can be modified with popular libraries, frameworks, and interfaces. SageMaker includes the ten most commonly used algorithms, such as K-means clustering, linear regression, principal component analysis, and factorization machines. These algorithms are designed to run ten times faster than their usual implementations, allowing processing to reach more efficient speeds.
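
As an illustration, here is a minimal sketch of launching one of those built-in algorithms (K-means) through the SageMaker Python SDK; the IAM role, output bucket, and data are hypothetical placeholders.

```python
# Minimal sketch: training SageMaker's built-in K-means algorithm.
# Role, bucket, and data are hypothetical placeholders.
import numpy as np
import sagemaker
from sagemaker import KMeans

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/HypotheticalSageMakerRole"  # placeholder

kmeans = KMeans(
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",
    k=10,                                                   # number of clusters
    output_path="s3://hypothetical-bucket/kmeans-output/",  # placeholder
    sagemaker_session=session,
)

train_data = np.random.rand(1000, 50).astype("float32")  # hypothetical feature matrix

# record_set uploads the data in the RecordIO-protobuf format the built-in
# algorithm expects; fit then launches a managed training job.
kmeans.fit(kmeans.record_set(train_data))
```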

Increased Accessibility for Developers
Amazon SageMaker has been geared to make training more accessible. Developers simply select the quantity and type of Amazon EC2 instances, along with the location of their data. Once training begins, SageMaker sets up a distributed compute cluster, runs the training job, and directs the output to Amazon S3. Amazon SageMaker can also fine-tune models with a hyperparameter optimization option, which tries different combinations of algorithm parameters to help developers arrive at the most precise predictions.
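
A minimal sketch of that workflow with the SageMaker Python SDK’s generic Estimator might look like the following; the IAM role, bucket paths, and hyperparameters are hypothetical, and the built-in XGBoost container is used only as an example.

```python
# Minimal sketch: select instance count/type and point at data in S3,
# then let SageMaker provision the cluster and run training.
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/HypotheticalSageMakerRole"  # placeholder

# Resolve the built-in XGBoost training image for the current region.
image_uri = sagemaker.image_uris.retrieve(
    framework="xgboost", region=session.boto_region_name, version="1.5-1"
)

estimator = Estimator(
    image_uri=image_uri,
    role=role,
    instance_count=2,                                        # quantity of EC2 instances
    instance_type="ml.m5.xlarge",                            # type of EC2 instances
    output_path="s3://hypothetical-bucket/model-output/",    # where artifacts land
    sagemaker_session=session,
)
estimator.set_hyperparameters(objective="binary:logistic", num_round=100)  # hypothetical

# Location of the training data in S3; SageMaker sets up the cluster,
# runs the job, and writes the trained model artifacts back to output_path.
train_input = TrainingInput("s3://hypothetical-bucket/train/", content_type="text/csv")
estimator.fit({"train": train_input})
```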

Faster One-Click Deployment
As mentioned before, SageMaker takes care of launching the instances used to set up HTTPS endpoints, so the application achieves high throughput combined with low-latency predictions. At the same time, it auto-scales Amazon EC2 instances across different Availability Zones (AZs) to keep processing fast and available. The main idea is to eliminate the heavy lifting within machine learning so that developers don’t have to get bogged down in elaborate coding and program development.
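
Continuing the hypothetical example above, deployment to a managed HTTPS endpoint could look roughly like this; the training job name and payload are placeholders.

```python
# Minimal sketch: deploy a completed training job behind a managed HTTPS endpoint.
# The training job name and request payload are hypothetical placeholders.
from sagemaker.estimator import Estimator
from sagemaker.serializers import CSVSerializer

# Attach to a (hypothetical) finished training job and deploy it to an endpoint.
estimator = Estimator.attach("hypothetical-training-job-name")
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.large",
    serializer=CSVSerializer(),
)

# Invoke the endpoint with a hypothetical CSV row of features.
print(predictor.predict("0.5,1.2,3.4"))

# Clean up the endpoint when finished to stop incurring charges.
predictor.delete_endpoint()
```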

Conclusion
Amazon SageMaker services are changing the way data is stored, processed, and used for training. With a variety of algorithms in place, developers can get their feet wet with various machine learning concepts and understand what goes on behind the scenes, all without becoming too involved in algorithm preparation and logic creation. It is an ideal solution for companies looking to help their developers focus on drawing more analytics from large volumes of data.
