Cloud Practitioner
There are sample questions somewhere Hands on is free to 12 euro (depending on service different payment schemes but everything pay as you go per use/time) Aws console is where you actually do stuff Make sure account is activated and choose free support plan
What is cloud computing
Client, server via network with ip addresses. Just like postboxes. Build up of a server: cpur, ram data and strucutred data (databases) Network our of cables routers switch and servers incl. dns
Mainting own data center(server cluster) pay rent, power, scaling is limited, people etc. solution: cloud
Cloud criteria from cloud computing lecture:
- on demand
- Broad network access
- Ressource sharing/pooling multi-tenancy
- Rapid elasticity and scaliblity
- Measured service pay as you go
Types of clouds:
- private (rackspace? Proxmox?)
- public aws azure, gc
- Hybrid some local some cloud.
When using a cloud u Trade capital expense capex for operational expense opex
Dont need to guess capacity, save on having to maintain a data center, high availability and fault tolerance
Types of cloud computing:
- IaaS, provide infrastructure highest level of flexibility (EC2, digital ocean)
- PaaS, dont need to manage infrastructure just run app, elastic beanstalk
- SaaS, just work no managment. Face rocgnition on aws or gmail
Show different levels the 3 + on premise of what has to be managed.
Aws has map on infrastructure.aws:
- regions
- availability zones
- data centers
- edge locations/points of presence
Regions have 3-6 zones with each 1 or more datacenters. Each zone has redudant power, network and connectivity. Availability zones are isolated from each other for disasters. And are connected with high bandwidth, low-latency network within region
A bit unclear of meaning of edge locations?
How to choose a region:
- compliance, data governance and legal requirements
- Proximity, reduce latency
- Availiability, not all services are available in all regions
- Pricing
Search at the top is very useful as also has docs and tutorials
Some services are global and have the same view no matter which region (can be seen in the top right)
Can list services by region (link todo)
Shared Responsibility Model
Shared responsibility diagram shows who is responsible for what (u or aws) how u configure services are ur responsibility if u configure shit security ur fault. Aws is responsible for security on software and hardware level.
IAM - Identity and access management
Directly vs inline? Is it the same?
The first service to look at is IAM, short for Identity and Access Management. This is the service that allows you to manage users and their level of access to the AWS Management Console. Because it is one of the most important services and is used to control access to all other services, it is a global service, meaning it is available in all regions.
To remember the acronym, think of yourself saying "I am" the person who is going to manage the users and their permissions.
When you first create an AWS account, you have created a root account. This account has complete access to all AWS services and can therfore also rack up a huge bill. It is not recommended to use the root account for everyday tasks, or to share it with others. Instead, you should create use the least privilege principle and create an IAM user for yourself and give it the necessary permissions. This is also a best practice for security reasons.
In AWS permissions are managed using policies. A policy is a document that defines permissions. It is written in JSON format and consists of a version, an identifier and a statement.
For example, the following policy AdminstratorcAccess
allows all actions on all resources:
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "*",
"Resource": "*"
}
]
}
The Effect
can be either Allow
or Deny
. The Action
is the action that is allowed or denied on a Resource
.
Actions are API calls that allow you to interact with AWS services such as read, write delete etc. The Resource
is the resource that the policy applies to such as an S3 bucket or an EC2 instance.
The *
is a wildcard that matches all actions or resources.
When creating a new user, you can assign them permissions by attaching policies to them. You can also create groups and assign policies to the group and then add users to the group. You can also create an alias for the sign-in link, which can be useful if you are working with multiple accounts, for example for different projects.
In AWS users, groups, roles etc. are also reffered to as identities.
Types of Policies
For a clear overview I suggest you look at the AWS documentation (opens in a new tab) on the topic.
However, in short there are three types of policies:
- Managed Policies: These are standalone policies that you can attach to multiple identities. They are maintained by AWS and are the recommended way to assign permissions.
This is for example the
AdministratorAccess
policy that allows full access to all AWS services or theIAMFullAccess
policy that allows full access to IAM. - Customer Managed Policies: These are policies that you create and manage yourself. You can attach them to multiple identities. This is when you define a custom policy via JSON or the visual editor. These policies are stored in your account and are reusable.
- Inline Policies: These are basically customer managed policies that are embedded directly into a single identity and are then also deleted when the identity is deleted.
Keeping Users Secure
Can be found under Account settings and then password policy.
When creating users for others there are some password policies that you can set to ensure that the passwords are secure:
- Minimum password length
- Require specific character types
- Allow users to change their own password
- Require password change on first login
- Password expiration and password reuse prevention
The most secure way to access the AWS Management Console is by using Multi-Factor Authentication (MFA).
MFA = something you know (password) and something you have (token on physical device)
Virtual MFA devices like Google Authenticator or Authy Universal 2nd Factor (U2F) security key like YubiKey Hardware Key Fob MFA devices
To activate MFA its at the top right of the console under Security Credentials. You can then choose to activate MFA for the root account. For IAM users you can activate the same MFA or enforce it for all users.
How to access AWS
- AWS Management Console: Web-based user interface that you can use to access and manage AWS services.
- AWS Command Line Interface (CLI): Command line tool that allows you to control multiple AWS services from the command line and automate them through scripts.
- AWS Software Development Kits (SDKs): Libraries or APIs that allow you to interact with AWS services from your preferred programming language.
Via access keys do not share. Can be used for CLI or SDKs. Can be rotated.
To configure CLI you can use aws configure
and enter your access key id , secret access key, region and output format (just enter for default).
aws iam list-users to list users very useful
What is cloudshell? is basically CLI in the management console, recommended to use it for CLI commands over local installation. can alos download and upload files to it, seems very useful. Not available in all regions.
Roles
In AWS some services might want to perform actions. For example an EC2 instance might want to access an S3 bucket. To do this you can use roles which are similar to users but are assigned to AWS services.
Security Tools
Can use Credentials Report to see all users and the status of their credentials. Account-level Can use IAM Access Advisor shows the services that a user has accessed and the last time they accessed them to align permissions with actual usage (least privilege principle). User-level
Billing and Cost Management
We can view usage and billing information in the Billing and Cost Management Dashboard. Here we can also set up budgets and alerts to notify us when we are exceeding our budget. "charges by service" is useful to see what is costing the most. Zero cost budget or montly budget can be set up etc.
Ec2 - elastic compute cloud
One of the key compute services in AWS is EC2, short for Elastic Compute Cloud.
IaaS
Mainly consist of:
- EC2, renting virtual machines
- EBS, virtual block storage for EC2 (what is block storage?)
- ELB, load balancer
- ASG, auto scaling group
Configuration:
- OS, linux, mac or windows
- cpu and ram
- storage, network attached storage with EBS or EFS or hardware storage with instance store
- network card for speed, security group for firewall, public ip, private ip, dns
- bootstrap script for configuration, ec2 user data for startup script, only runs once when instance is launched (restart?) installing updates, softeare
- ec2 user data script runs as root
show example of ec2 instances
to ssh into ec2 create a key pair and use ssh -i key.pem ec2-user@ip in the security group allow port 22 for ssh and 80 for http as will start a Web server ebs volumes can be configured like deletion on termination, encryption, snapshotting etc.
at the bottom of the advanced section of the ec2 instance creation there is a user data section where you can add a bash script that will run on startup. This can be used to install software, configure the instance etc. Why is it called user data?
After starting and stopping an instance the public ip will change. To have a static ip you can use an elastic ip.
There are different types of instances like general purpose (balanced), compute optimized, memory optimized and storage optimized.
m5.2xlarge, m = instance class 5 = generation 2xlarge = size within the instance class
ec2instances.info is a useful website to compare instances
Security Groups
Fundamental firewall rules for your EC2 instances. They control the inbound and outbound traffic to your instances.
regualte:
- access to ports
- authorise ip ranges, also if ipv4 or ipv6
- control of inbound network (ingress) and outbound network (egress) traffic
By default all inbound traffic is blocked and all outbound traffic is allowed.
many to many relationship between security groups and instances.
If a timeout occurs when trying to connect to an instance it is likely a security group issue. for example if you are trying to connect to an instance via ssh and the security group does not allow port 22.
If you get connection refused it is likely an issue with the instance itself.
Can authorize specific security groups to allow traffic between instances in the same security group without specifying ip addresses.
Most important ports:
- 22 for ssh
- 21 for ftp, unencrypted
- 22 SFTP, encrypted ftp over ssh
- 80 for http unencrypted
- 443 for https encrypted
- 3389 for rdp remote desktop protocol, windows instances
0.0.0.0/0 means all ip addresses
ec2 instance connect is a new feature that allows you to connect to an instance without manually using ssh.
Go into the instance and click connect, then connect with ec2 instance connect. Will temporaily create a key pair and connect to the instance. Port 22 must be open in the security group.
Never cofnigure of enter iam credentials into an instance. Use roles instead.
purchsing options:
- on demand, pay as you go
- reserved instances, 1-3 years, cheaper up to 70%, upfront, partial upfront or no upfront
- convertible reserved instances, change instance type, same but less discount
- savings plans, commit to a certain amount of usage 10 per hour for 1 year, cheaper than on demand excess is on-demand
- spot instances, bid for unused capacity, cheap but can be terminated at any time, less reliable up to 90% cheaper, not suitable for critical jobs
- dedicated hosts, physical server for you, expensive, complicance requirements or licensing like per core etc.
- dedicated instances, no other customers on the same hardware, expensive
- capacity reservations, reserve capacity for specific instance type in a specific availability zone, expensive, are assured capacity, billed on-demand even if don't use
there is a table for between dedicated host vs dedicated instance
EBS - Elastic Block Store
Block storage vs object storage?
"network usb stick"
storage options for ec2 instances. network drive that can be attached to an ec2 instance. persist data even if instance is terminated. Not a physical drive, more latency than local storage. can only be attached to one instance at a time and are bound to a specific availability zone.
to move ebs between AZs you can create a snapshot and then create a new volume from the snapshot in the new AZ. Fixed size but can be increased over time.
can set "delete on termination" to false to keep the volume when the instance is terminated. by default the root volume is deleted on termination and additional volumes are kept.
EBS Snapshots are backups of your EBS volumes. recommend to first detach the volume before creating a snapshot (make it clean). using snapshots you can create new volumes, move volumes between AZs or regions, share snapshots with other accounts etc. snapshots can be copied to other regions for disaster recovery.
can be moved to archive which is cheaper but takes longer to restore 24 hours to 72 hours.
you can setup rules for when snapshots are deleted to go into a recycle bin for 7 days before being permanently deleted. "retention rules"
AMI - Amazon Machine Image
powers ec2 instances. like software configuration, os, application server, applications etc. This allows for faster boot times and consistency across instances as everything preconfigured and packeged.
a public ami is for example the amazon linux 2 ami. you can also create your own ami from an existing instance. think of docker images. Create own ami or use marketplace ami can potentially save time but also cost and security concerns.
AMIs are built from a ec2 instance. Ideally you should stop the instance before creating the AMI to ensure that the file system is in a consistent state. This creation process will also create a snapshot of the EBS volumes attached to the instance.
EC2 Image Builder
automate the creation of VMs or container images. i.e automate the creation, maintain, validate and test AMIs.
Starts a builder ec2 instance where you can install software, configure etc. and then create an image from that. Then an instance is launched from the image and the image is tested with tests that you define. if they pass then published.
This can be setup on a schedule to ensure that the images are always up to date.
EC2 Instance Store
certain instance types come with instance store volumes which are physically attached, so better performance but data is lost when the instance is stopped or terminated (ephemeral storage). good for like caches, buffers, scratch data etc. risk of data loss, so not recommended for critical data.
EFS - Elastic File System
Managed network file system that can be shared across multiple ec2 instances. Think of it as a network usb stick like the AD. highly available, scalable, expensive, pay for what you use no capacity planning. instances can be across different AZs.
EFS-IA is infrequent access storage class that is cheaper but has a retrieval fee. for data that is not accessed often. will automatically move files that are not accessed often to the infrequent access storage class based on a lifecycle policy. once you access the file it will be moved back to the standard storage class.
FSx
use 3rd party high performance file systems like windows file server or lustre for high performance computing. native and supported for windows via smb and integrated with Microsoft Active Directory for user authentication.
Lustre is Linux and cluster file system for high performance computing. for ML and analytics workloads.
ELS and ASG
ELS is an elastic load balancer that distributes incoming traffic across multiple targets like ec2 instances. ASG is an auto scaling group that automatically adjusts the number of ec2 instances in response to demand.
Scalability = ability to handle increased load, either:
- vertically, for call center instead of junior operator hire senior operator, common for databases
- horizontally (elasticity), for call center just get more operators, implies a distributed system, common for web servers
- elasticity = ability to automatically increase or decrease capacity based on demand, auto-scaling pay per use
- agility = agile development, fast deployment, fast changes, fast feedback
high availability = ability to stay up and running, run in at least 2 availability zones to survivie a disaster.
both require load balancing and auto scaling. high availability also wants multi AZ mode for both.
load balancer forward internet traffic to multiple ec2 instances downstream. load balancer can be internal or external, i.e internet facing or not. single point of acces, spread load, high availability, fault tolerance, health checks for downstream instances.
EBS = managed load balancer, meaning aws manages the load balancer for you makes sure is up and running, scales automatically, no need to worry about maintenance. Can setup custom load balancer which is cheaper but more work.
4 types of load balancers:
- Application Load Balancer (ALB), http and https grpc, layer 7, http routing features, static dns/url
- Network Load Balancer (NLB), TCP, high performance, layer 4, static ip through elastic ip
- Gateway Load Balancer, layer 3, GENEVE protocol on ip packets, route traffic to firewalls managed by ec2 instances for intrusion detection etc. traffic is checked and insprected via GWLB.
- Classic Load Balancer, retired 2023, layer 4 and 7
traffic is routed to "target groups" which depend on the protocol and port. can have multiple target groups per load balancer. and each target group can have multiple targets.
ASG for auto scaling, like shopping during the day but not at night. can scale out (add instances) or scale in (remove instances) based on demand. Can setup min/desired/max, they can also be registerd to a target group. the ASG will also replace unhealthy instances because of health checks.
launch templates are instructions for the ASG on how to launch instances. can be used to launch instances with specific configurations like ami, instance type, key pair etc.
scaling strategies:
- manual, not recommended, update the desired capacity manually
- dynamic, either simple or step scaling, based on cloudwatch alarms based on a trigger like cpu usage > 70% for 5 minutes add or under 30% for 5 minutes remove instances?? target tracking scaling, scale out or in to keep a metric at a specific value like cpu usage at 70%. lastly scheduled scaling, scale out or in based on a schedule like every saturday scale out for sports betting website.
- predictive scaling, machine learning to predict future demand based on historical data. will provision instances in advance for easy pattern recognition.
S3 - Simple Storage Service
many use cases like backup and storage, disaster recovery, archiving, hybrid cloud storage, data lakes, static websites etc.
store objects (files) in buckets (folders). buckets must have a globally unique name across all regions and accounts. buckets is a global service but are stored in a region.
objects have a key (full path) prefix + object name. Seems like a folder but doesn't actually exist. object values are the content. max size of an object is 5TB and can store any type of file like images, videos, backups, logs etc. When uploading more then 5GB must use "multipart upload", X/5GB parts. object metadata, list of text key value pairs like content type from system or user. object tags unicode key value pairs by user for organization, cost allocation etc. max 10 tags per object. version id for versioning if versioning is enabled. can be used to restore previous versions of an object.
Security
- user based, IAM policies, sets which api calls a user can make
- resource based, bucket policies, bucket wide rules such as public access, cross account access etc.
- Object Access Control List (ACL), fine grained control
- Bucket Access Control List (ACL), less common
objects can be encrypted with encryption keys
Bucket policies, that allows anyone to read, i.e public read access. principal are the users that are allowed to do the action, therefor * is everyone.
{
"Version": "2012-10-17",
"Statement": [
{
"Sid": "PublicRead",
"Effect": "Allow",
"Principal": "*",
"Action": "s3:GetObject",
"Resource": "arn:aws:s3:::mybucketname/*"
}
]
}
You can set block "all public access" to true to prevent public access to the bucket even if there is a bucket policy that allows it.