Glen Knight

NYC-Based IT Professional

Step Functions Distributed Map – A Serverless Solution for Large-Scale Parallel Data Processing

I am excited to announce the availability of a distributed map for AWS Step Functions. This flow extends support for orchestrating large-scale parallel workloads such as the on-demand processing of semi-structured data. The Step Functions map state executes the same processing steps for multiple entries in a dataset. The existing map state is limited to 40 […]

AWS Machine Learning University New Educator Enablement Program to Build Diverse Talent for ML/AI Jobs

AWS Machine Learning University is now providing a free educator enablement program. This program provides faculty at community colleges, minority-serving institutions (MSIs), and historically Black colleges and universities (HBCUs) with the skills and resources to teach data analytics, artificial intelligence (AI), and machine learning (ML) concepts to build a diverse pipeline for in-demand jobs of […]

New for Amazon Redshift – Simplify Data Ingestion and Make Your Data Warehouse More Secure and Reliable

When we talk with customers, we hear that they want to be able to harness insights from data in order to make timely, impactful, and actionable business decisions. A common pattern with data-driven organizations is that they have many different data sources they need to ingest into their analytics systems. This requires them to build […]

New — Introducing Support for Real-Time and Batch Inference in Amazon SageMaker Data Wrangler

To build machine learning models, machine learning engineers need to develop a data transformation pipeline to prepare the data. Designing this pipeline is time-consuming and requires cross-team collaboration among machine learning engineers, data engineers, and data scientists to implement the pipeline in a production environment. The main objective of […]

New — Amazon SageMaker Data Wrangler Supports SaaS Applications as Data Sources

Data fuels machine learning. In machine learning, data preparation is the process of transforming raw data into a format that is suitable for further processing and analysis. The common process for data preparation starts with collecting data, then cleaning it, labeling it, and finally validating and visualizing it. Getting the data right with high quality […]

Announcing Additional Data Connectors for Amazon AppFlow

Gathering insights from data is a more effective process if that data isn’t fragmented across multiple systems and data stores, whether on premises or in the cloud. Amazon AppFlow provides bidirectional data integration between on-premises systems and applications, SaaS applications, and AWS services. It helps customers break down data silos using a low- or no-code, […]

New – Amazon EC2 Hpc6id Instances Optimized for High Performance Computing

We have given you the flexibility and ability to run the largest and most complex high performance computing (HPC) workloads with Amazon Elastic Compute Cloud (Amazon EC2) instances that feature enhanced networking, such as C5n, C6gn, R5n, M5n, and our recently launched Hpc6a HPC instances. We heard feedback from customers asking us to deliver more options to support […]

Preview: Amazon Security Lake – A Purpose-Built Customer-Owned Data Lake Service

To identify potential security threats and vulnerabilities, customers should enable logging across their various resources and centralize these logs for easy access and use within analytics tools. Some of these data sources include logs from on-premises infrastructure, firewalls, and endpoint security solutions, and when utilizing the cloud, services such as Amazon Route 53, AWS CloudTrail, […]

New – Amazon Redshift Integration with Apache Spark

Apache Spark is an open-source, distributed processing system commonly used for big data workloads. Spark application developers working in Amazon EMR, Amazon SageMaker, and AWS Glue often use third-party Apache Spark connectors that allow them to read and write the data with Amazon Redshift. These third-party connectors are not regularly maintained, supported, or tested with […]

Preview: Amazon OpenSearch Serverless – Run Search and Analytics Workloads without Managing Clusters

Most AWS analytics services have compelling serverless offerings that make it even easier for customers to analyze vast amounts of data without having to configure, scale, or manage the underlying infrastructure. Along with other serverless analytics, such as Amazon QuickSight for business intelligence and AWS Glue for data integration, we have introduced Amazon EMR Serverless, […]
