AWS
AWS Services
- AWS AI Service Cards: responsible-AI documentation for AWS AI services (intended use cases, limitations)
- AWS Artifact: provides on-demand access to security and compliance documentation for the AWS Cloud
- AWS CloudTrail: logs AWS service activity (including API calls)
- AWS Config: overview of AWS resource configurations
- AWS DeepRacer: teaches reinforcement learning in an interactive, virtual environment
- AWS Glue DataBrew: visual data preparation tool
- AWS Glue: integrate data from multiple sources
- AWS Identity and Access Management (IAM): control access to AWS resources
- AWS Inferentia: ML inference accelerator (chip) that delivers high performance at low cost on Amazon EC2
- AWS Key Management Service (KMS): store and manage cryptographic keys
- AWS Lambda: serverless compute service
- AWS Secrets Manager: manage and maintain credentials
- AWS Trusted Advisor: provide recommendations to improve account (e.g. cost savings)
Amazon Services
- Amazon Augmented AI (A2I): human review of ML predictions
- Amazon Bedrock Agents: autonomous agents to orchestrate interactions between FMs, data sources etc.
- Amazon Bedrock Guardrails: manage generative AI applications (filter topics, add safeguard to models)
- Amazon Bedrock: provides access to FMs (ready to use, limited customization)
- Amazon CloudWatch: monitoring and observability service (metrics, logs, alarms) for applications on AWS (NO API call logging; see CloudTrail)
- Amazon Comprehend: extracts documents insights (key phrases, sentiment)
- Amazon DocumentDB: fully managed, MongoDB compatible, JSON document database, supports real-time vector search with low latency
- Amazon DynamoDB: NoSQL key-value database
- Amazon EC2: secure, resizable compute capacity (virtual servers)
- Amazon Elastic Container Registry (ECR): fully managed repository for container images
- Amazon Inspector: vulnerability management service
- Amazon Kendra: semantic & contextual search
- Amazon Lex: build chatbots
- Amazon Macie: discover, monitor, and protect sensitive data in Amazon S3
- Amazon Mechanical Turk (MTurk): allow businesses to hire remote workers for manual tasks (e.g. data labeling)
- Amazon OpenSearch Service: fully managed service for OpenSearch on AWS with vector store & similarity search
- Amazon Personalize: generates user recommendations
- Amazon Polly: convert text to speech
- Amazon Q Business: fully managed generative AI assistant for enterprise data
- Amazon Q Developer: AI assistant for AWS applications
- Amazon Q: generative AI assistant to answer business questions
- Amazon QuickSight: BI and reporting tool to build reports and dashboards
- Amazon RDS for Oracle: relational database with Oracle support
- Amazon Redshift: fully managed data warehouse for SQL analytics
- Amazon Rekognition: analyze visual content (image and video)
- Amazon S3: object storage service
- Amazon SageMaker Autopilot: automates ML model creation, training, and tuning
- Amazon SageMaker Canvas: build ML models without writing code
- Amazon SageMaker Clarify: explain model responses, detect bias
- Amazon SageMaker Data Wrangler: import, cleanse, analyze, and transform data
- Amazon SageMaker Feature Store: create, store, share, and manage features
- Amazon SageMaker Ground Truth: label data for ML model training
- Amazon SageMaker JumpStart: provides access to FMs (more control, need deployment)
- Amazon SageMaker Model Cards: documentation for models (intended use, evaluation metrics)
- Amazon SageMaker Model Monitor: alerts when model quality changes (data drift)
- Amazon SageMaker Model Registry: fully managed catalog for ML models
- Amazon SageMaker Pipelines: create MLOps workflows
- Amazon SageMaker Studio: suite of IDEs (RStudio, VSCode)
- Amazon SageMaker: collection of tools to train and deploy ML models on AWS
- Amazon Textract: extract text from images and PDFs
- Amazon Titan: family of FMs in Amazon Bedrock
- Amazon Transcribe: convert speech to text
- Amazon Translate: language translation service
- PartyRock: Amazon Bedrock Playground to build AI-generated apps
Terms & Definitions
- accuracy: percentage of correct predictions: (TP + TN) / total predictions
- anomaly detection: unsupervised learning algorithm to identify anomalies (Random Cut Forest - RCF)
- asynchronous inference: useful for large data without immediate response
- attention mechanism: help the model focus on relevant parts of the input when generating text
- BERTScore: measure similarity between chatbot and human responses
- bias: low bias = model is not making erroneous assumptions about the training data
- chain of thought: breaks down a complex question into smaller parts
- classification: supervised learning algorithm to categorize data (binary/multiclass/image)
- clustering: unsupervised ML method to group similar objects (k-means)
- computer vision: field of AI to interpret and understand visual data
- context window: # tokens model can accept in the context
- continued pre-training: provide unlabeled data to FM to improve domain knowledge
- cross-validation: model evaluation technique that trains and tests on multiple different splits of the data
- diffusion model: AI model to generate image from prompt
- domain adaptation fine-tuning: fine-tune FM with domain-specific information
- embeddings: numerical representations of words
- explainability: ability to understand how a model arrives at a prediction
- extract, transform, load (ETL): combines data from multiple sources into single data set
- F1 score: metric to evaluate classification models: 2 * precision * recall / (precision + recall)
- fairness: impartial and just treatment without discrimination
- feature engineering: selecting and transforming raw data into features for model training
- few-shot learning: make predictions using a few examples in prompt
- fine-tuning: improves model’s performance using labeled data
- forecasting: forecast 1D time series data (DeepAR)
- foundational model (FM): broader than an LLM, can handle various data types
- gateway endpoint: secure connection from a VPC to Amazon S3/DynamoDB
- generative AI security scoping matrix: framework to classify generative AI use cases (ownership)
- generative AI: AI to create new content (images, text, music, or conversations)
- hallucination: false information generated by LLM
- image_uri: Docker image URI in Amazon SageMaker AI
- in-context learning: add instructions & examples inside prompt
- inference_instances: list of inference instances for a model deployed in Amazon SageMaker AI
- inference: model prediction
- instruction-based fine-tuning: use labeled examples (prompt, response pairs) to improve FM on a specific task
- knowledge cutoff: data limitation due to LLMs pre-trained on static datasets
- large language model (LLM): focus on text & language tasks
- machine learning (ML): train models to make predictions based on existing data
- mean absolute error (MAE): mean of the absolute differences between the actual values and the predicted values
- mean absolute percentage error (MAPE): mean of the absolute differences between the actual values and the predicted values, divided by the actual values
- mean squared error (MSE): mean of the squared differences between the predicted and actual values
- multimodal model: can understand data from multiple modalities
- natural language processing (NLP): ML technology to interpret, manipulate, and comprehend human language
- overfitting: good model performance on training data but not on new data
- perplexity: measures how well a model predicts a given sequence of words (lower = more confident)
- personally identifiable information (PII): data that can identify an individual
- precision (positive predicted value PPV): TP / (TP + FP)
- prompt engineering: optimize FM inputs to generate better responses
- prompt injection attacks: crafted inputs that manipulate an LLM (e.g. ignoring the prompt template, exploiting friendliness, prompting persona switches)
- prompt template: predefined format to standardize inputs and outputs
- recall (true positive rate TPR): TP / (TP + FN)
- regression: supervised learning algorithm to predict a continuous output from input data
- reinforcement learning from human feedback (RLHF): use human feedback to train ML model
- retrieval augmented generation (RAG): technique to let LLM use information from external knowledge base
- ROUGE-N: measure similarity between generated and reference summary
- S3 bucket policy: grants access to S3 data
- semi-structured data: e.g. nested .json file, hierarchically organized .xml file
- semi-supervised learning: combines a small amount of labeled data with a large amount of unlabeled data (e.g. recognizing driving scenarios that are not fully labeled)
- serverless inference: useful for near-real time requests with idle periods & cold starts
- single-shot prompt engineering: provide single example
- specificity (true negative rate TNR): TN / (TN + FP)
- stop sequences: inference parameter to interrupt text generation
- structured data: tabular with rows and columns (e.g. csv)
- supervised learning: train models on labeled data set
- temperature: control randomness of response (higher temperature -> more random)
- token: basic unit of text (word, subword, or characters) that the model processes and predicts
- tokenization: splitting input text into individual words or subword units
- top K: # of most-likely candidates considered for next token.
- top P: cumulative probability threshold: sample from the smallest set of candidates whose probabilities sum to P
- underfitting: model too simple to capture underlying data patterns
- unstructured data: data without a predefined format (e.g. plain text, images)
- unsupervised learning: train models on unlabeled data to find patterns (e.g. clustering, anomaly detection)
- variance: high variance = model is paying attention to noise in the training data and is overfitting
- vector database: efficiently store and manage high-dimensional data
- zero-shot learning: make predictions without examples
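The classification metrics above (accuracy, precision, recall, specificity, F1) can be sketched from confusion-matrix counts; the TP/TN/FP/FN values below are hypothetical toy numbers:

```python
# Hypothetical confusion-matrix counts for a binary classifier.
TP, TN, FP, FN = 80, 90, 10, 20

accuracy = (TP + TN) / (TP + TN + FP + FN)          # correct / total predictions
precision = TP / (TP + FP)                          # positive predictive value (PPV)
recall = TP / (TP + FN)                             # true positive rate (TPR)
specificity = TN / (TN + FP)                        # true negative rate (TNR)
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision & recall

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} specificity={specificity:.2f} f1={f1:.2f}")
```

Note that F1 is preferred over accuracy when the classes are imbalanced, since accuracy can look high even when the minority class is predicted poorly.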
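The regression metrics (MAE, MAPE, MSE) follow directly from their definitions; the actual/predicted values below are made-up illustration data:

```python
# Toy actual vs. predicted values (hypothetical data).
actual = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]
n = len(actual)

# MAE: mean of absolute differences.
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
# MAPE: mean of absolute differences divided by the actual values.
mape = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / n
# MSE: mean of squared differences (penalizes large errors more).
mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n
```

MAPE is scale-independent (useful for comparing models across datasets), while MSE amplifies large errors because of the squaring.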
