The following are my takeaways:

Udacity course: Networking for Developers

Ping to HTTP

  • ping does not need a server program to send back the ack. The ping request is received by the operating system, and the operating system sends the acknowledgement. You can ping any operating system
  • Sending an HTTP request with nc, and the response that comes back:
printf 'HEAD / HTTP/1.1\r\nHost: en.wikipedia.org\r\n\r\n' | nc en.wikipedia.org 80
HTTP/1.1 301 TLS Redirect
Date: Thu, 26 Dec 2019 01:41:48 GMT
Server: Varnish
X-Varnish: 817889376
X-Cache: cp5011 int
X-Cache-Status: int-front
Server-Timing: cache;desc="int-front"
Set-Cookie: WMF-Last-Access=26-Dec-2019;Path=/;HttpOnly;secure;Expires=Mon, 27 Jan 2020 00:00:00 GMT
Set-Cookie: WMF-Last-Access-Global=26-Dec-2019;Path=/;Domain=.wikipedia.org;HttpOnly;secure;Expires=Mon, 27 Jan 2020 00:00:00 GMT
X-Client-IP: 13.229.230.30
Location: https://en.wikipedia.org/
Content-Length: 0
Connection: keep-alive
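The same exchange can be sketched in Python with a plain TCP socket, which is roughly what nc does under the hood. This is a minimal sketch, not how nc is actually implemented; the function names `build_head_request` and `send_raw` are made up for illustration.

```python
import socket


def build_head_request(host: str, path: str = "/") -> bytes:
    """Return the same bytes that the printf command above writes."""
    return f"HEAD {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")


def send_raw(host: str, port: int, payload: bytes, timeout: float = 5.0) -> bytes:
    """Open a TCP connection, send the payload, return the first chunk of the reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
        # The socket knows nothing about HTTP; it just hands back bytes
        return conn.recv(4096)


# Example call (needs network access):
# print(send_raw("en.wikipedia.org", 80, build_head_request("en.wikipedia.org")).decode("iso-8859-1"))
```

Nothing in `send_raw` is specific to HTTP: the server answers only because the bytes it receives happen to look like an HTTP request.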
  • nc connects to a port and sends a string. Since the string looks like an HTTP request, the server responds to it
  • nc stands for netcat
  • nc does not know anything about HTTP
  • nc can be used to connect to various machines
  • The text of the HTTP request is piped to nc
  • nc is a thin wrapper over TCP
  • Application protocols such as HTTP and SSH are implemented by the application layer: things that make sense to applications such as browsers
  • Transport protocols such as TCP and UDP are implemented by the transport layer
  • Protocols such as IP are implemented by the internet layer
  • Wi-Fi, Ethernet and DSL are implemented by hardware
  • IP is the "narrow waist" of the protocol stack
  • nc -l 1234 listens on port 1234
  • Different programs run and listen on different ports
  • All the other ports where computer
  • Listening on a port is a simple way of being a server
  • nc is a plain TCP server. TCP is a two-way channel: both ends can send messages to each other
  • CTRL+D - end of input
  • A server listens on a well-known port for its application
  • The client initiates the connection and can use a different port
  • There is a largest port that we can listen on. Going past it gives the error "Servname not supported for ai_socktype"
  • Highest port is 65535 - Not an arbitrary limit
  • Ports 0 - 1023 are reserved for the superuser. You have to use sudo
  • If you want to listen on port 80, you need to be the superuser
  • Only one program can listen on a given port
  • Once a program starts, it can create separate threads or processes to handle several connections. nc does not have this capability. A web server does, as it spawns several processes
  • sudo lsof -i shows which programs are listening on which ports (for example 6011 and 6012)
  • A browser makes several requests for one page (HTML, images, CSS). The more of them it can make in parallel, the faster the page loads
  • Whenever a new connection arrives, a new child process is spawned to handle the request. There is a limit on the number of child processes that can be spawned
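To make "listening on a port is a simple way of being a server" concrete, here is a small sketch that plays both roles on the loopback interface. It is an illustration only: the helper names are made up, and the upper-casing is just there to show that data flows both ways. Port 0 asks the OS for any free port, so no sudo is needed.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Accept one connection and send back whatever arrives, upper-cased."""
    conn, _peer = listener.accept()
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def echo_roundtrip(message: bytes) -> bytes:
    """Start a one-shot server, connect to it as a client, return the reply."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]     # the port the OS actually chose

    server = threading.Thread(target=serve_once, args=(listener,))
    server.start()

    # The client side: initiates the connection from its own, different port
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(message)
        reply = client.recv(1024)

    server.join()
    listener.close()
    return reply
```

Calling `echo_roundtrip(b"hello")` sends the bytes from the client to the server and gets them back over the same TCP connection.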

DNS - names and addresses

  • Every packet has a Destination IP
  • A DNS name need not map to a single host
  • A single DNS name can have a bunch of IP addresses, for load balancing
  • Domain Name System (DNS) - maps names to IP addresses
    • DNS A record: maps a name to an IPv4 address
    • One has to create DNS records
    • Register at a registrar and create DNS records
    • If DNS goes down, the site cannot be reached
  • A DNS resolver is built into every operating system
  • host is built into the OS. It gives the addresses for a name:
host -ta www.refinitiv.com
www.refinitiv.com is an alias for d347ymu6kosx4n.cloudfront.net.
d347ymu6kosx4n.cloudfront.net has address 54.192.151.13
d347ymu6kosx4n.cloudfront.net has address 54.192.151.44
d347ymu6kosx4n.cloudfront.net has address 54.192.151.46
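A program can ask the operating system's resolver the same question. The sketch below uses Python's `socket.getaddrinfo`; the wrapper name `ipv4_addresses` is made up for illustration, and the result depends on the resolver and on when you ask, so no output is shown.

```python
import socket
from typing import List


def ipv4_addresses(name: str) -> List[str]:
    """Return the distinct IPv4 addresses the OS resolver finds for a name."""
    infos = socket.getaddrinfo(
        name, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4 the sockaddr is an (ip, port) pair
    return sorted({info[4][0] for info in infos})


# Example call (needs network access):
# print(ipv4_addresses("www.refinitiv.com"))
```

A name that is load balanced across several hosts, like the one above, should come back as a list with more than one address.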
  • dig can be used to get a lot more information about DNS lookups
  • There are many types of DNS records, for example A (address)
  • DNS is a distributed directory. No one DNS server needs to know all the DNS records
    • top level domains
    • records for a certain domain will be found in authoritative name servers
    • NS records are listed for higher level servers
  • gTLD - generic top-level domain
  • Resolvers talk to a nearby caching server
  • Caching server - consults its local cache; on a miss it recursively resolves the query and then gives the IP
  • DNS records have a TTL (time to live). If you look up a cached record again, the remaining TTL is lower
  • Apache - Virtual Host configuration
  • The Host header is a required part of an HTTP/1.1 request
  • DNS names are structured as trees
    • Domains
    • SubDomain
    • SubDomain of a SubDomain
    • www.refinitiv.com is a subdomain of refinitiv.com
  • Institutions - webserver - Single machine representing institution
  • Many domains skip the www and point the bare domain at the web server
  • Whether to use bare domains or subdomains is a style and branding preference
  • Fully Qualified Domain Names - need the FQDN
  • Apache webserver
    • By setting up a domain
    • Setting up a
  • A domain name costs approximately 12 USD
  • Each packet contains the IP addresses of the sender and the receiver. A 32-bit IP address is written as 4 octets, which makes it easy to read
    • plain decimal is difficult to read
  • IPv4 has 32 bits to represent the destination or target IP address
  • IPv6 has 128 bits - With a given number of bits, you can make only a certain number of distinct values
  • Highest port is 65535
  • You cannot listen
  • Port numbers are 16-bit values (two octets), so the maximum port number is 65535
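The limits in the last few bullets all follow from the number of bits available. A short sketch of the arithmetic, using Python's standard `ipaddress` module (the helper names are made up for illustration):

```python
import ipaddress

IPV4_BITS = 32
IPV6_BITS = 128
PORT_BITS = 16


def distinct_values(bits: int) -> int:
    """Number of distinct values that fit in the given number of bits."""
    return 2 ** bits


def dotted_quad_to_int(address: str) -> int:
    """The single 32-bit number behind a dotted-quad IPv4 address."""
    return int(ipaddress.IPv4Address(address))


highest_port = distinct_values(PORT_BITS) - 1   # ports start at 0
ipv4_total = distinct_values(IPV4_BITS)         # a little over 4 billion
ipv6_total = distinct_values(IPV6_BITS)

print(highest_port)
print(ipv4_total)
print(dotted_quad_to_int("54.192.151.13"))
```

The last line shows why dotted quads exist: the plain decimal form of an address is much harder to read than four octets.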

Addresses and Networks

  • An IPv4 address is written as four octets, but that is mostly a convention. It is just a 32-bit value
  • Some of the addresses are reserved for internal networks
  • (image: grid of IPv4 address blocks by first octet, coloured by how they are reserved)
  • If all 32 bits were used to represent public hosts, that would still not be enough
  • the light-green squares (0, 10, and 127) are blocks that are entirely reserved.
  • the dark-green squares are blocks that are partly reserved. for instance, not all of the 192 block is reserved, but some of it is.
  • the entire cyan row (starting at 224) is set aside for ip multicast.
  • And the entire orange bottom row (starting at 240) was originally set aside for “future use” but was effectively lost due to being blocked as invalid. No, really. We lost 1/16th of all IPv4 addresses due to mistaken planning.
  • Not every address can be assigned to a public host - More than a billion IPv4 addresses
  • Most of the IP addresses in a network block
  • Not all networks are of the same size
  • Network prefix are shorter
  • A /8 network has a fixed number of host addresses
  • CIDR is short for Classless Inter-Domain Routing, an IP addressing scheme that replaces the older system based on classes A, B, and C. A single IP address can be used to designate many unique IP addresses with CIDR. A CIDR IP address looks like a normal IP address except that it ends with a slash followed by a number, called the IP network prefix. CIDR addresses reduce the size of routing tables and make more IP addresses available within organizations.
  • Split the network into blocks - All the addresses with the same prefix are in the same block
  • Length of the prefix is important
  • A network with a shorter prefix is larger
  • A network with a longer prefix has fewer addresses
  • /22 network - By just looking at an IP address, it is difficult to tell the size of its network
  • /24 ~ 256 addresses
  • Another way to write the prefix - Subnet mask: 1's on the left and 0's on the right
  • IPv4 - 32-bit values, written as dotted decimal quads
  • /24 - 255.255.255.0
  • /16 - 255.255.0.0
  • Prefixes need not be whole octets
  • AWS subnets - default vpc for a user in a region
  • default subnets for a user
  • Addresses seem to belong to hosts, but they actually belong to interfaces. A host can have many interfaces.
  • Every machine has a loopback interface
  • Might have a tunnel
  • VM interface connecting host and guest operating system
  • Listing the interfaces:
ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9001 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 02:bf:ff:f8:f4:74 brd ff:ff:ff:ff:ff:ff
    inet 172.31.19.219/20 brd 172.31.31.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::bf:ffff:fef8:f474/64 scope link
       valid_lft forever preferred_lft forever

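The prefix arithmetic can be checked against the eth0 address in the output above (172.31.19.219/20) with Python's standard `ipaddress` module. This is a small sketch; the `describe` helper is made up for illustration.

```python
import ipaddress


def describe(cidr: str) -> dict:
    """Summarise the network that an interface address such as 'a.b.c.d/n' belongs to."""
    iface = ipaddress.ip_interface(cidr)
    network = iface.network
    return {
        "network": str(network),
        "netmask": str(network.netmask),              # 1s on the left, 0s on the right
        "broadcast": str(network.broadcast_address),
        "addresses": network.num_addresses,           # 2 ** (32 - prefix length)
        "private": iface.ip.is_private,
    }


info = describe("172.31.19.219/20")
for key, value in info.items():
    print(key, value)
```

The /20 prefix is not a whole number of octets, which is why the netmask comes out as 255.255.240.0, and the broadcast address matches the `brd 172.31.31.255` shown by `ip addr show`.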
  • Loopback is a special interface that allows a host to connect to itself
  • Home router - One interface faces the internet and another faces your laptop
  • At least two interfaces on a Linux box on AWS - lo: the loopback interface and eth0: an Ethernet interface
  • A router connects two networks
  • Most hosts have one interface
  • A router will have at least two interfaces
  • The default gateway connects to the rest of the internet
  • Traffic for other networks is routed via the default gateway
  • ip route show default - shows the router address
  • netstat -nr - shows the default router address
  • Private IP addresses come out of the reserved blocks - Most common on home routers
  • NAT - Whenever traffic moves between private and public networks, the router keeps track of which inside devices and ports are connected to the outside world
  • NAT - Makes it difficult to debug
  • Private addresses - Good only on local networks
    • An address starting with 192.168. is behind a NAT router
    • Private use - Never used on the public internet
    • Web servers do not see the private address
  • Commands to find the public IP of an AWS Ubuntu instance:
curl http://169.254.169.254/latest/meta-data/public-ipv4
curl ipecho.net/plain ; echo
  • NAT is not needed with IPv6. Between 2012 and 2015, IPv6 increased to 10% of all traffic
  • More home and mobile users have IPv6 than business and office users
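The bookkeeping a NAT router does can be pictured as a lookup table. The toy model below is only a sketch of the idea, not how any real router is implemented: the class name, the starting port 40000 and the public address 203.0.113.7 (a documentation-only address) are all assumptions made for illustration.

```python
from itertools import count
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip address, port)


class NatTable:
    """Toy NAT: maps each inside (private ip, port) to a port on one public IP."""

    def __init__(self, public_ip: str, first_port: int = 40000) -> None:
        self.public_ip = public_ip
        self._next_port = count(first_port)
        self._outbound: Dict[Endpoint, int] = {}   # inside endpoint -> public port
        self._inbound: Dict[int, Endpoint] = {}    # public port -> inside endpoint

    def outbound(self, private_ip: str, private_port: int) -> Endpoint:
        """Return the public (ip, port) that the outside world sees for this flow."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            public_port = next(self._next_port)
            self._outbound[key] = public_port
            self._inbound[public_port] = key
        return self.public_ip, self._outbound[key]

    def inbound(self, public_port: int) -> Optional[Endpoint]:
        """Return the inside endpoint a reply belongs to, or None if unknown."""
        return self._inbound.get(public_port)


nat = NatTable("203.0.113.7")
print(nat.outbound("192.168.1.10", 51000))
print(nat.outbound("192.168.1.11", 51000))
print(nat.inbound(40001))
```

Two laptops using the same private port end up on different public ports, and the web server on the other side only ever sees the public address. That hidden mapping is part of what makes NAT hard to debug.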

Protocol Layers

  • TCP is built on the IP protocol
  • TCP is the middle layer of the networking protocols
  • HTTP and other application protocols are built on top of TCP
  • As a developer, you mostly work at one level of abstraction
  • Flask works with URLs, resources and HTTP verbs, but problems in the lower layers are still visible at that level
  • OS TCP implementation - Browsers merely use the TCP implemented in the OS
  • TCP relies on IP
  • IP relies on physical devices
  • ping uses the ICMP protocol
  • DNS uses the UDP protocol
  • These protocols are available in the operating system
  • Wi-Fi has nothing to do with TCP
  • TCP has nothing to do with HTTP sessions
  • pcap filter
  • tcpdump to monitor applications
    • Look at all the IP packets
    • How much data has been sent ?
    • Overhead
    • Out of all the packets, only a few have payload
    • TCP - Even before the client
  • One exchange at the HTTP level is a bunch of packets at a lower level
  • What happens when a TCP connection is set up:
  • Each endpoint puts in a sequence number
  • For each packet it receives, the client sends a separate ack
  • Even when the client is not transmitting application data, there will be packet exchanges
  • What is the need for sequence numbers?
  • Each endpoint's operating system keeps some memory for the connection
    • Keeps the sequence number
    • Packets can be sent again
  • TCP handshake : This exchange of three packets is usually called the TCP three-way handshake
  • In a long-running connection, there will be many packets exchanged back and forth. Some of them will contain application data; others may be only acknowledgments with no data (length 0). However, all TCP packets in a connection except the initial SYN will contain an acknowledgment of all the data that the sender has received so far. Therefore, they will all have the ACK flag set. (This is why tcpdump depicts the ACK flag with just a dot: it’s really common.)
  • Four way teardown
    • When either endpoint is done sending data into the connection, it can send a FIN packet to indicate that it is finished. The other endpoint will send an ACK to indicate that it has received the FIN. In the example HTTP data, the client sends its FIN first, as soon as it is done sending the HTTP request. This is the first packet containing Flags [F.]. Eventually the other endpoint will be done sending as well, and will send a FIN of its own. Then the first endpoint will send an ACK.
  • Why do Packets drop ?
    • Not an option at all
    • Two different networks connected by a slow network
    • Slow network can
    • Routers cannot buffer stuff because of bottlenecks
    • TCP and Routers do not do it that way
    • TCP does not send data at full blast. It sends only as fast as the acks are received. Also, routers drop packets to signal congestion.
    • If the router queues up, then the
    • TCP congestion control
    • As it is trying to get through, TCP sends packets slower and slower
    • One of the reasons for packet loss is congestion, and TCP does not want to push more through a congested link
    • TCP has a lot of built-in timeouts
  • Python requests library - timeout
  • TCP session timeout between browser and web server?
  • Why time out?
    • The other host is powered off
    • The connection between you and the internet: routers drop the message
    • DNS is only used once, to set up the connection; later DNS is not used at all
  • Design applications on the assumption that failures will happen
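Designing for failure usually starts with setting an explicit timeout instead of waiting forever. Here is a minimal sketch with Python's standard `socket` module; the function name `recv_with_timeout` is made up for illustration, and returning `None` on a timeout is just one possible design choice.

```python
import socket
from typing import Optional


def recv_with_timeout(host: str, port: int, timeout: float) -> Optional[bytes]:
    """Connect and wait for data; return None if the peer stays silent too long."""
    try:
        # The timeout applies to connecting and to each later recv() call
        with socket.create_connection((host, port), timeout=timeout) as conn:
            return conn.recv(1024)
    except socket.timeout:
        # The other side accepted the connection but never answered in time
        return None
```

Other failures, such as a refused connection, still raise their own exceptions, so the caller can tell "no answer" apart from "nobody listening". The requests library takes a similar `timeout` argument on its request functions.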

Big Networks

  • Routers
  • traceroute
  • ping is not the only measurement across the network
  • safety feature - traceroute
    • Every packet has a TTL field, which is reduced by 1 each time the packet crosses a router
    • If routers are misconfigured, the TTL will eventually go to 0 and the packet expires
    • When a packet's TTL expires, the router sends an error message to the sender
    • traceroute works on TTL - it sends packets with progressively increasing TTLs, so that each router along the path in turn reports an expiry
    • ping will give you a binary outcome
    • traceroute will help you investigate the problem
  • Speed of a network connection
    • Bound by speed of light
    • Takes sometime to
    • Grace Hopper - nanoseconds
  • Slow bandwidth
    • Bandwidth might not matter at all if all you are looking at is text
  • Slow cell phones - dial up
  • Understood nanoseconds from Admiral Grace Hopper's video
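The TTL trick behind traceroute can be simulated without touching the network. The sketch below is a toy model, not the real tool: the path is a made-up list of router names ending in the destination, and `simulate_traceroute` is a hypothetical helper.

```python
from typing import List, Optional


def simulate_traceroute(path: List[str], max_ttl: int = 30) -> List[Optional[str]]:
    """Return who answers each probe when the TTL starts at 1, 2, 3, ...

    path lists the routers in order, followed by the destination host.
    """
    if not path:
        return []
    hops: List[Optional[str]] = []
    for ttl in range(1, max_ttl + 1):
        remaining = ttl
        answered_by = None
        for index, node in enumerate(path):
            if index == len(path) - 1:
                answered_by = node      # the probe reached the destination
                break
            remaining -= 1              # each router decrements the TTL
            if remaining == 0:
                answered_by = node      # TTL expired here: this router reports back
                break
        hops.append(answered_by)
        if answered_by == path[-1]:
            break                       # destination reached, stop probing
    return hops


print(simulate_traceroute(["home-router", "isp-router", "backbone", "server"]))
```

Each probe gets one hop further than the previous one, which is how traceroute builds up the list of routers on the way to the destination.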

Computing pioneer Grace Hopper, inventor of the compiler, searched for a concrete way to give an intuitive understanding of just how fast a nanosecond (a billionth of a second) is, which was the speed of their new computer circuits. As an illustration she settled on a length of wire as long as light can travel in one nanosecond. The length is a very portable 11.8 inches. A microsecond's worth of wire is still portable, but a much bulkier 984 feet. In one millisecond light travels 186 miles, which only Hercules could carry. In today's terms, at a 3.06 GHz clock speed, there are about 0.33 nanoseconds between ticks, or roughly 3.9 inches of light travel.
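The wire lengths above can be checked from the speed of light in a vacuum. A short sketch (the function name is made up; the constants are the standard definitions of the metre, inch, foot and mile):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458   # exact, by definition of the metre
METERS_PER_INCH = 0.0254
INCHES_PER_FOOT = 12
FEET_PER_MILE = 5280


def light_distance_inches(seconds: float) -> float:
    """Distance light travels in a vacuum in the given time, in inches."""
    return SPEED_OF_LIGHT_M_PER_S * seconds / METERS_PER_INCH


nanosecond_inches = light_distance_inches(1e-9)
microsecond_feet = light_distance_inches(1e-6) / INCHES_PER_FOOT
millisecond_miles = light_distance_inches(1e-3) / INCHES_PER_FOOT / FEET_PER_MILE

print(round(nanosecond_inches, 1), "inches per nanosecond")
print(round(microsecond_feet), "feet per microsecond")
print(round(millisecond_miles), "miles per millisecond")
```

Signals in real cables and fibre travel slower than light in a vacuum, so these figures are upper bounds on how far a signal can get.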

  • Bandwidth and Latency
    • Bandwidth (bits/second) times latency (seconds) gives the amount of data in flight (bits)
    • Bandwidth is like the flow rate of water per second
    • Latency is the delay in seconds that it takes to get through
  • Beaming data - 1 Gbit/second
  • Congestion: everyone wants to send data and the router has to decide what traffic to send. If a router has more traffic to send than the outgoing link can carry, it drops packets
  • Packet loss is not only due to noisy lines but also due to router congestion
  • If the router drops packets, the sender resends the same data more slowly
  • NAT and Proxy servers
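The bandwidth and latency bullets can be turned into two small formulas. This is a simplified sketch: it ignores protocol overhead, congestion and TCP's gradual ramp-up, and the 50 ms latency and the 2 KB page size are made-up example numbers.

```python
def bits_in_flight(bandwidth_bits_per_s: float, latency_s: float) -> float:
    """Bandwidth times latency: how much data is 'in the pipe' at once."""
    return bandwidth_bits_per_s * latency_s


def transfer_time_s(size_bits: float, bandwidth_bits_per_s: float, latency_s: float) -> float:
    """Rough one-way delivery time: the delay plus the time to push the bits out."""
    return latency_s + size_bits / bandwidth_bits_per_s


GIGABIT = 1_000_000_000
LATENCY_S = 0.05                 # assumed 50 ms, for illustration only
SMALL_PAGE_BITS = 2_000 * 8      # assumed 2 KB of text

print(bits_in_flight(GIGABIT, LATENCY_S))
print(transfer_time_s(SMALL_PAGE_BITS, GIGABIT, LATENCY_S))
print(transfer_time_s(SMALL_PAGE_BITS, GIGABIT / 100, LATENCY_S))
```

For a small text page the delivery time is dominated by latency, so a link with a hundredth of the bandwidth is barely slower. That is the sense in which bandwidth might not matter at all for text.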

Takeaways

It was a refresher for me, as I had forgotten some basic aspects of networking:

  • Subnetting
  • DNS
  • IP addressing schemes
  • pinging
  • traceroute
  • Bandwidth
  • Latency
  • NAT
  • Proxy Servers
  • Basic idea of a nanosecond - 11.8 inches
  • Packets
  • ping replies are built into any OS that knows how to connect to the internet
  • netcat
  • Largest port that one can listen on, smallest port that one can listen on
  • dig
  • Basic stack