Top Announcements of AWS re:Invent 2022

Amazon Security Lake is a purpose-built service that automatically centralizes security data from multiple sources into a data lake stored in your own account. It helps security teams analyze security data easily and gain a complete understanding of the organization’s security posture. Security Lake has adopted the Open Cybersecurity Schema Framework (OCSF), an open standard that helps normalize and combine security data from various sources, including on-premises infrastructure, firewalls, AWS CloudTrail, Amazon Route 53, and Amazon VPC Flow Logs. Security Lake also supports integrating data from third-party security solutions, as well as custom data that is in OCSF format.
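
To illustrate what normalizing data from several sources into one schema means, here is a minimal Python sketch. The output field names (source, time, src_ip, activity) and the normalize helpers are assumptions made for illustration; they are not the actual OCSF schema, and the input records are heavily trimmed versions of CloudTrail and VPC Flow Logs entries.

```python
from datetime import datetime, timezone


def normalize_cloudtrail(record: dict) -> dict:
    """Map a trimmed CloudTrail-style record to the simplified common shape."""
    ts = datetime.strptime(record["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "source": "cloudtrail",
        "time": int(ts.replace(tzinfo=timezone.utc).timestamp()),
        "src_ip": record["sourceIPAddress"],
        "activity": record["eventName"],
    }


def normalize_flow_log(record: dict) -> dict:
    """Map a trimmed VPC Flow Logs-style record to the same common shape."""
    return {
        "source": "vpc_flow_logs",
        "time": int(record["start"]),
        "src_ip": record["srcaddr"],
        "activity": record["action"],
    }


# Hypothetical registry: one normalizer per data source.
NORMALIZERS = {
    "cloudtrail": normalize_cloudtrail,
    "vpc_flow_logs": normalize_flow_log,
}


def normalize(source: str, record: dict) -> dict:
    return NORMALIZERS[source](record)


events = [
    normalize("cloudtrail", {
        "eventTime": "2022-11-29T10:00:00Z",
        "sourceIPAddress": "203.0.113.7",
        "eventName": "ConsoleLogin",
    }),
    normalize("vpc_flow_logs", {
        "start": "1669716000",
        "srcaddr": "203.0.113.7",
        "action": "REJECT",
    }),
]

# Once every record has the same shape, a single filter spans all sources.
by_ip = [e for e in events if e["src_ip"] == "203.0.113.7"]
```

The benefit of a shared schema shows in the last line: one query covers both sources without source-specific field handling.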

AWS Application Composer is a new AWS service that helps developers simplify and accelerate architecting, configuring, and building serverless applications. Users can visually compose serverless applications from AWS services with little guesswork. Its browser-based visual canvas supports dragging and dropping AWS services and connecting them to form an application architecture. This helps developers overcome the challenges of configuring various AWS services and of writing infrastructure as code (IaC) to deploy the application. AWS Application Composer keeps the visual representation of the application architecture in sync with the IaC in real time.

Amazon Inspector Now Scans AWS Lambda Functions for Vulnerabilities: Amazon Inspector, a vulnerability management service that continually scans Amazon Elastic Compute Cloud (Amazon EC2) instances and container images in Amazon Elastic Container Registry (Amazon ECR), now supports scanning AWS Lambda functions and Lambda layers. Previously, customers who wanted to assess their Lambda functions against common vulnerabilities had to combine AWS and third-party tools, which made it more complex to keep all their workloads secure. Because new vulnerabilities can appear at any time, workloads need to be continuously monitored and rescanned in near real time as new vulnerabilities are published.

AWS Clean Rooms: AWS Clean Rooms helps companies bring together data from different environments and lets them securely analyze and collaborate on combined data sets without sharing the underlying raw data with each other. This helps firms better understand their own customers and enables joint data analysis.

Amazon Redshift Streaming Ingestion: With this new capability, Amazon Redshift can natively ingest hundreds of megabytes of data per second from Amazon Kinesis Data Streams and Amazon MSK into an Amazon Redshift materialized view and query it within seconds.

Amazon Redshift Integration for Apache Spark: This integration makes it easy to build and run Spark applications on Amazon Redshift and Redshift Serverless, opening up the data warehouse to a broader set of AWS analytics and machine learning (ML) solutions.

Amazon Athena for Apache Spark: With this feature, you can run Apache Spark workloads on Athena and use Jupyter notebooks as the interface for data processing. Customers can explore data interactively and gain insights without provisioning or maintaining resources to run Apache Spark.

Create Point-to-Point Integrations Between Event Producers and Consumers with Amazon EventBridge Pipes: In modern event-driven applications, where multiple cloud services are used as building blocks, communication between the services requires integration code, and maintaining that code is a challenge. Amazon EventBridge Pipes is a new feature of Amazon EventBridge that makes it easier to build event-driven applications by providing a simple, consistent, and cost-effective way to create point-to-point integrations between event producers and consumers, removing the need to write undifferentiated glue code. Amazon EventBridge Pipes includes the most popular features of Amazon EventBridge event buses, such as event filtering, integration with more than 14 AWS services, and automatic delivery retries.
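
To make the idea of a pipe concrete, here is a small Python sketch of the kind of glue code such a service replaces: read events from a source, keep only those that match a filter, and deliver them to a target with retries. This is a local simulation, not the EventBridge Pipes API; the matches and run_pipe functions, the simplified pattern format, and the sample events are all assumptions made for illustration.

```python
from typing import Callable, Iterable


def matches(pattern: dict, event: dict) -> bool:
    """Simplified filter: every pattern key must exist in the event.

    A nested dict recurses into the event; a list means "any of these values".
    """
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


def run_pipe(source: Iterable[dict], pattern: dict,
             target: Callable[[dict], None], max_attempts: int = 3) -> int:
    """Deliver matching events to the target, retrying failed deliveries."""
    delivered = 0
    for event in source:
        if not matches(pattern, event):
            continue  # filtered out, never reaches the target
        for attempt in range(1, max_attempts + 1):
            try:
                target(event)
                delivered += 1
                break
            except Exception:
                # Give up only after the last allowed attempt.
                if attempt == max_attempts:
                    raise
    return delivered


# Usage: a target that fails once, then succeeds.
received = []
failures = {"remaining": 1}


def flaky_target(event: dict) -> None:
    if failures["remaining"] > 0:
        failures["remaining"] -= 1
        raise RuntimeError("temporary failure")
    received.append(event)


orders = [
    {"detail": {"status": "PAID", "id": 1}},
    {"detail": {"status": "CANCELLED", "id": 2}},
    {"detail": {"status": "PAID", "id": 3}},
]
delivered = run_pipe(orders, {"detail": {"status": ["PAID"]}}, flaky_target)
```

With a managed pipe, the filtering and retry logic above becomes configuration rather than code that each team has to write and maintain.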

Amazon DataZone is a new data management service that makes it faster and easier for customers to catalog, discover, share, and govern data stored across AWS, on-premises, and third-party sources. “To unlock the full power, the full value of data, we need to make it easy for the right people and applications to find, access, and share the right data when they need it, and to keep data safe and secure,” AWS CEO Adam Selipsky said in his keynote session. DataZone lets you safely set data free throughout the organization by making it easy for admins and data stewards to manage and govern access to data. It provides a data catalog, accessible through a web portal, where users within an organization can find data for analytics, business intelligence, and machine learning.

AWS Supply Chain is a new cloud-based application that helps supply chain leaders mitigate risks and lower costs to increase supply chain resilience. AWS Supply Chain unifies supply chain data, provides ML-powered actionable insights, and offers built-in contextual collaboration, all of which help you increase customer service levels by reducing stockouts and help you lower costs from overstock.

Support for Real-Time and Batch Inference in Amazon SageMaker Data Wrangler: Deploy data preparation flows from SageMaker Data Wrangler for real-time and batch inference. This feature allows you to reuse the data transformation flow that you created in SageMaker Data Wrangler as a step in Amazon SageMaker inference pipelines.

SageMaker Data Wrangler support for real-time and batch inference speeds up your production deployment because there is no need to repeat the implementation of the data transformation flow.

You can now integrate SageMaker Data Wrangler with SageMaker inference. The same data transformation flows created with the easy-to-use, point-and-click interface of SageMaker Data Wrangler, containing operations such as Principal Component Analysis and one-hot encoding, will be used to process your data during inference. This means that you don’t have to rebuild the data pipeline for a real-time and batch inference application, and you can get to production faster.
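
The value of reusing one transformation flow for both training and inference can be shown with a short Python sketch. This is not Data Wrangler code: the OneHotEncoder class, the build_flow helper, and the size and color columns are hypothetical names chosen for illustration. The point is that the encoding learned from the training data is applied unchanged at inference time instead of being reimplemented.

```python
class OneHotEncoder:
    """Minimal one-hot encoder: learns categories once, then reuses them."""

    def fit(self, values):
        self.categories = sorted(set(values))
        return self

    def transform(self, value):
        # Unknown categories encode as all zeros.
        return [1 if value == c else 0 for c in self.categories]


def build_flow(training_rows):
    """Fit the transformation on training data and return it as one callable."""
    encoder = OneHotEncoder().fit(row["color"] for row in training_rows)

    def flow(row):
        return [row["size"]] + encoder.transform(row["color"])

    return flow


training_rows = [
    {"size": 2.0, "color": "red"},
    {"size": 3.5, "color": "blue"},
]

flow = build_flow(training_rows)

# Training time: prepare features for the model.
train_features = [flow(row) for row in training_rows]

# Inference time: the same flow object processes a new request.
inference_features = flow({"size": 1.0, "color": "blue"})
```

Because the inference path calls the same flow, the column order and category mapping cannot drift apart from what the model saw during training.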

Classifying and Extracting Mortgage Loan Data with Amazon Textract: Until now, classifying and extracting data from mortgage loan application packages has been a human-intensive task, although some lenders have used a hybrid approach with technology such as Amazon Textract. However, customers told us that they needed even greater workflow automation to speed up their efforts and reduce human error so that their staff could focus on higher-value tasks.

The new Analyze Lending API also provides additional value-add services. It can detect signatures, identifying which documents are signed and which are not. It also provides a summary of the documents in a mortgage application package and identifies select important documents, such as bank statements and 1003 forms, that would normally be present. The new workflow is powered by a collection of machine learning (ML) models. When a mortgage application package is uploaded, the workflow classifies the documents in the package and then routes each one to the right ML model, based on its classification, for data extraction.
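
The classify-then-route workflow described above can be sketched in a few lines of Python. This is a toy stand-in, not the Amazon Textract API or its response format: the keyword classifier, the regular-expression extractors, the document type labels, and the sample package are all assumptions made for illustration.

```python
import re
from collections import Counter


def classify(page_text: str) -> str:
    """Toy keyword classifier standing in for a trained model."""
    text = page_text.lower()
    if "uniform residential loan application" in text:
        return "FORM_1003"
    if "account summary" in text or "ending balance" in text:
        return "BANK_STATEMENT"
    return "UNCLASSIFIED"


def extract_bank_statement(text: str) -> dict:
    m = re.search(r"ending balance:\s*\$?([\d,]+\.\d{2})", text, re.IGNORECASE)
    return {"ending_balance": m.group(1) if m else None}


def extract_1003(text: str) -> dict:
    m = re.search(r"borrower name:\s*(.+)", text, re.IGNORECASE)
    return {"borrower_name": m.group(1).strip() if m else None}


# Hypothetical routing table: one extractor per document type.
EXTRACTORS = {
    "BANK_STATEMENT": extract_bank_statement,
    "FORM_1003": extract_1003,
}


def process_package(pages):
    """Classify each document, route it to its extractor, and summarize."""
    results = []
    for text in pages:
        doc_type = classify(text)
        extractor = EXTRACTORS.get(doc_type)
        results.append({
            "type": doc_type,
            "fields": extractor(text) if extractor else {},
            # Crude signature check standing in for signature detection.
            "has_signature": "signature:" in text.lower(),
        })
    summary = dict(Counter(r["type"] for r in results))
    return results, summary


package = [
    "Uniform Residential Loan Application\nBorrower name: Jane Doe\nSignature: J. Doe",
    "Account Summary\nEnding balance: $12,345.67",
]
results, summary = process_package(package)
```

Routing by document type keeps each extractor small and specialized, which mirrors the design the announcement describes: a classifier in front of a collection of type-specific models.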

Process PDFs, Word Documents, and Images with Amazon Comprehend for IDP: With Amazon Comprehend for intelligent document processing (IDP), customers can process their semi-structured documents, such as PDF and Word (docx) files and PNG, JPG, or TIFF images, as well as plain-text documents, with a single API call. This new feature combines OCR with Amazon Comprehend’s existing natural language processing (NLP) capabilities to classify documents and extract entities from them. The custom document classification API allows you to organize documents into categories or classes, and the custom named entity recognition API allows you to extract entities such as product codes or business-specific entities from documents. For example, an insurance company can now process scanned customer claims with fewer API calls. Using the Amazon Comprehend entity recognition API, it can extract the customer number from each claim, and using the custom classifier API, it can sort the claim into one of the insurance categories: home, car, or personal.

Next Generation SageMaker Notebooks – Now with Built-in Data Preparation, Real-Time Collaboration, and Notebook Automation: SageMaker Studio notebooks automatically generate key visualizations on top of pandas DataFrames to help you understand data distribution and identify data quality issues, such as missing values, invalid data, and outliers. You can also select the target column for ML models and generate ML-specific insights, such as class imbalance or highly correlated columns. You then receive recommendations for data transformations to resolve the issues. You can apply the data transformations right in the UI, and SageMaker Studio notebooks automatically generate the corresponding transformation code in notebook cells, which you can use to replay your data preparation pipeline.
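
As a rough idea of the kind of checks behind such data quality insights, the following Python sketch counts missing values, flags outliers with the common 1.5 x IQR rule, and turns the findings into recommendations. It uses only the standard library rather than pandas, and the column_insights and recommend functions, the IQR rule, and the sample column are assumptions made for illustration, not what SageMaker Studio actually runs.

```python
from statistics import quantiles


def column_insights(values):
    """Report missing values and IQR-based outliers for one numeric column."""
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)

    # Quartiles of the non-missing values.
    q1, _, q3 = quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    outliers = [v for v in present if v < low or v > high]
    return {"missing": missing, "outliers": outliers}


def recommend(insights):
    """Turn detected issues into suggested data transformations."""
    recommendations = []
    if insights["missing"]:
        recommendations.append("impute or drop missing values")
    if insights["outliers"]:
        recommendations.append("cap or remove outliers")
    return recommendations


# Hypothetical "age" column with one missing value and one invalid entry.
ages = [31, 29, None, 35, 30, 33, 240, 28]

insights = column_insights(ages)
recommendations = recommend(insights)
```

In this example the value 240 falls far outside the interquartile range and is flagged, and the single None is counted as missing, so both recommendations are returned.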

SageMaker Studio now offers shared spaces that give data science and ML teams a workspace where they can read, edit, and run notebooks together in real time to streamline collaboration and communication during the development process. Each shared space provides a shared Amazon EFS directory that you can use to share files within the space. All taggable SageMaker resources that you create in a shared space are automatically tagged, which helps you organize your ML resources, such as training jobs, experiments, and models, and get a filtered view of those relevant to the business problem you work on in the space. This also helps you monitor costs and plan budgets using tools such as AWS Budgets and AWS Cost Explorer.

Schedule a meeting with our AWS cloud solution experts and accelerate your cloud journey with Idexcel.