Chapter 1. Basic Network Concepts


1.  Each machine on a network is called a node. Nodes that are fully functional computers are also called hosts.

 

2.  Addresses are assigned differently on different kinds of networks. Manufacturers of Ethernet hardware use preassigned manufacturer codes to make sure there are no conflicts between the addresses in their hardware and the addresses of other manufacturers’ hardware.

 

3.  Internet addresses are normally assigned to a computer by the organization that is responsible for it. However, the addresses that an organization is allowed to choose for its computers are assigned by the organization’s Internet service provider (ISP). ISPs get their IP addresses from one of five regional Internet registries, which are in turn assigned IP addresses by the Internet Corporation for Assigned Names and Numbers (ICANN).

 

4.  On some kinds of networks, nodes also have text names that help human beings identify them. One address can have several names and one name can refer to several different addresses.

 

5.  All modern computer networks are packet-switched networks: data traveling on the network is broken into chunks called packets and each packet is handled separately. Each packet contains information about who sent it and where it’s going.
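As a rough illustration of the idea, the following toy sketch breaks a message into fixed-size chunks and labels each one with a sender and a recipient. The `Packet` class, the addresses, and the 8-byte chunk size are all hypothetical; real packets are binary structures built by the operating system's network stack, not Java objects.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: it mimics the "chunk plus addressing information" idea, nothing more.
final class PacketSketch {

    // Hypothetical packet type: sender, recipient, position in the stream, and a chunk of data.
    static final class Packet {
        final String source;
        final String destination;
        final int sequence;
        final byte[] payload;

        Packet(String source, String destination, int sequence, byte[] payload) {
            this.source = source;
            this.destination = destination;
            this.sequence = sequence;
            this.payload = payload;
        }
    }

    // Breaks data into chunks of at most maxPayload bytes, each labeled with both endpoints.
    static List<Packet> split(String source, String destination, byte[] data, int maxPayload) {
        if (maxPayload <= 0) {
            throw new IllegalArgumentException("maxPayload must be positive");
        }
        List<Packet> packets = new ArrayList<>();
        int sequence = 0;
        for (int offset = 0; offset < data.length; offset += maxPayload) {
            int end = Math.min(offset + maxPayload, data.length);
            packets.add(new Packet(source, destination, sequence++,
                    Arrays.copyOfRange(data, offset, end)));
        }
        return packets;
    }

    public static void main(String[] args) {
        byte[] message = "data is broken into chunks".getBytes(StandardCharsets.US_ASCII);
        for (Packet p : split("10.0.0.1", "10.0.0.2", message, 8)) {
            System.out.println("#" + p.sequence + " " + p.source + " -> " + p.destination
                    + " (" + p.payload.length + " bytes)");
        }
    }
}
```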

 

6.  A protocol is a precise set of rules defining how computers communicate: the format of addresses, how data is split into packets, and so on. Open, published protocol standards allow software and equipment from different vendors to communicate with one another.

 

7.  To hide most of the complexity from the application developer and end user, the different aspects of network communication are separated into multiple layers. Each layer represents a different level of abstraction between the physical hardware and the information being transmitted. In theory, each layer only talks to the layers immediately above and immediately below it. While the middle layer protocols are fairly consistent across most of the Internet today, the top and the bottom vary a lot. The key is that from the top of the stack, it doesn’t really matter what’s on the bottom and vice versa. The layer model decouples the application protocols from the physics of the network hardware and the topology of the network connections.




 

8.  The standard TCP/IP four-layer model is appropriate for the Internet. In this model, applications like Firefox and Warcraft run in the application layer and talk only to the transport layer. The transport layer talks only to the application layer and the Internet layer. The Internet layer in turn talks only to the host-to-network layer and the transport layer, never directly to the application layer. The host-to-network layer moves the data across the wires, fiber-optic cables, or other medium to the host-to-network layer on the remote system, which then moves the data up the layers to the application on the remote system.




9.  It’s entirely possible that data sent across the Internet will pass through several routers and their layers before reaching its final destination.

 

10.  90% of the time your Java code will work in the application layer and only need to talk to the transport layer. The other 10% of the time, you’ll be in the transport layer and talking to the application layer or the internet layer. The complexity of the host-to-network layer is hidden from you.

 

11.  The primary reason you’ll need to think about the host-to-network layer or the physical layer is performance. For instance, if your clients reside on fast, reliable fiber-optic connections, you will design your protocol and applications differently than if they’re on high-latency satellite connections on an oil rig in the North Sea. You’ll make still different choices if your clients are on a 3G data plan where they’re charged by the byte for relatively low bandwidth. And if you’re writing a general consumer application that could be used by any of these clients, you’ll try to hit a sweet spot somewhere in the middle, or perhaps even detect and dynamically adapt to individual client capabilities. However, whichever physical links you encounter, the APIs you use to communicate across those networks are the same. What makes that possible is the internet layer.

 

12.  A network layer (internet layer) protocol defines how bits and bytes of data are organized into the larger groups called packets, and the addressing scheme by which different machines find one another. The Internet Protocol (IP) is the most widely used network layer protocol in the world and the only network layer protocol Java understands.

 

13.  In both IPv4 and IPv6, data is sent across the internet layer in packets called datagrams. Each IPv4 datagram contains a header between 20 and 60 bytes long and a payload that contains up to 65,515 bytes of data. An IPv6 datagram contains a larger header and up to four gigabytes of data. 

 

14.  The following chart shows how the different quantities are arranged in an IPv4 datagram. All bits and bytes are big endian; most significant to least significant runs left to right:
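The fixed 20-byte part of an IPv4 header can also be read field by field in code. The sketch below is only an illustration: the class name, the helper methods, and the sample bytes are made up for this note, the checksum is left at zero and not verified, and options are ignored. Java itself never hands raw IP headers to a program, so the bytes here are simply a hand-written array.

```java
import java.nio.ByteBuffer;

// Illustrative reader for the fixed part of an IPv4 header (big-endian fields).
final class Ipv4HeaderSketch {

    // The total-length field is 16 bits wide, so a datagram is at most 65,535 bytes.
    static final int MAX_TOTAL_LENGTH = 0xFFFF;

    // Hand-made sample: version 4, 20-byte header, total length 60, TTL 64, protocol 6 (TCP),
    // source 192.168.0.1, destination 192.168.0.199. The checksum bytes are placeholders.
    static byte[] sampleHeader() {
        return new byte[] {
            0x45, 0x00, 0x00, 0x3C,
            0x1C, 0x46, 0x40, 0x00,
            0x40, 0x06, 0x00, 0x00,
            (byte) 0xC0, (byte) 0xA8, 0x00, 0x01,
            (byte) 0xC0, (byte) 0xA8, 0x00, (byte) 0xC7
        };
    }

    static int version(byte[] h)       { return (h[0] >> 4) & 0x0F; }
    static int headerLength(byte[] h)  { return (h[0] & 0x0F) * 4; } // stored in 32-bit words
    static int totalLength(byte[] h)   { return ByteBuffer.wrap(h).getShort(2) & 0xFFFF; }
    static int payloadLength(byte[] h) { return totalLength(h) - headerLength(h); }
    static int timeToLive(byte[] h)    { return h[8] & 0xFF; }
    static int protocol(byte[] h)      { return h[9] & 0xFF; }
    static String source(byte[] h)      { return dottedQuad(h, 12); }
    static String destination(byte[] h) { return dottedQuad(h, 16); }

    // Largest payload that fits behind a header of the given size.
    static int maxPayload(int headerLength) { return MAX_TOTAL_LENGTH - headerLength; }

    private static String dottedQuad(byte[] h, int offset) {
        return (h[offset] & 0xFF) + "." + (h[offset + 1] & 0xFF) + "."
                + (h[offset + 2] & 0xFF) + "." + (h[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] h = sampleHeader();
        System.out.println("IPv" + version(h) + ", header " + headerLength(h)
                + " bytes, payload " + payloadLength(h) + " bytes");
        System.out.println(source(h) + " -> " + destination(h)
                + ", TTL " + timeToLive(h) + ", protocol " + protocol(h));
        System.out.println("max payload with a 20-byte header: " + maxPayload(20));
    }
}
```

With the minimum 20-byte header, `maxPayload` gives the 65,515-byte limit mentioned in note 13.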



 

 

15.  Besides routing and addressing, the second purpose of the Internet layer is to enable different types of Host-to-Network layers to talk to each other. The internet layer is responsible for connecting heterogeneous networks to each other using homogeneous protocols.

 

16.  The transport layer is responsible for ensuring that packets are received in the order they were sent and that no data is lost or corrupted. There are two primary protocols at this level. The first, the Transmission Control Protocol (TCP), is a high-overhead protocol that allows for retransmission of lost or corrupted data and delivery of bytes in the order they were sent. The second protocol, the User Datagram Protocol (UDP), allows the receiver to detect corrupted packets but does not guarantee that packets are delivered in the correct order (or at all).
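A minimal UDP sketch, assuming the loopback interface is available: one socket sends a single datagram to another socket in the same process. The class and method names are invented for this note. Nothing in the code asks for acknowledgment or ordering; on loopback the datagram almost always arrives, but UDP itself makes no such promise, which is why the receiver sets a timeout.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Sends one UDP datagram over loopback and reads it back.
final class UdpLoopbackSketch {

    static String sendAndReceive(String message) {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        // Port 0 asks the operating system for any free port.
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket()) {
            receiver.setSoTimeout(2000); // give up after 2 seconds instead of blocking forever

            byte[] out = message.getBytes(StandardCharsets.UTF_8);
            sender.send(new DatagramPacket(out, out.length, loopback, receiver.getLocalPort()));

            byte[] buffer = new byte[1024];
            DatagramPacket in = new DatagramPacket(buffer, buffer.length);
            receiver.receive(in);
            return new String(in.getData(), in.getOffset(), in.getLength(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sendAndReceive("hello over UDP"));
    }
}
```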

 

17.  The application layer decides what to do with the data after it’s transferred. For example, an application protocol like HTTP (for the World Wide Web) makes sure that your web browser displays a graphic image as a picture, not a long stream of numbers. The application layer is where most of the network parts of your programs spend their time. In addition, your programs can define their own application layer protocols as necessary.

 

18.  IP, the Internet protocol, was developed with military sponsorship during the Cold War, and ended up with a lot of features that the military was interested in. First, it had to be robust. IP was designed to allow multiple routes between any two points and to route packets of data around damaged routers. Second, IP had to be open and platform-independent. Different kinds of computers had to be able to talk to one another.

 

19.  TCP was layered on top of IP to give each end of a connection the ability to acknowledge receipt of IP packets and request retransmission of lost or corrupted packets. Furthermore, TCP allows the packets to be put back together on the receiving end in the same order they were sent. TCP, however, carries a fair amount of overhead. Therefore, if the order of the data isn’t particularly important and if the loss of individual packets won’t completely corrupt the data stream, packets are sometimes sent using UDP instead, without the guarantees that TCP provides.
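For comparison, here is a small TCP sketch under the same loopback assumption: a one-shot echo server and a client in a single process. The names are hypothetical. Acknowledgment, retransmission, and ordering all happen inside the TCP implementation, so the application code only sees a reliable stream of bytes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Echoes one line over a TCP connection on the loopback interface.
final class TcpLoopbackSketch {

    static String echoOnce(String line) {
        // Port 0 lets the operating system pick a free port.
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress())) {
            Thread echo = new Thread(() -> {
                try (Socket peer = server.accept()) {
                    BufferedReader in = new BufferedReader(
                            new InputStreamReader(peer.getInputStream(), StandardCharsets.UTF_8));
                    Writer out = new OutputStreamWriter(peer.getOutputStream(), StandardCharsets.UTF_8);
                    out.write(in.readLine() + "\n"); // send the same line straight back
                    out.flush();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            echo.setDaemon(true);
            echo.start();

            try (Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
                client.setSoTimeout(2000);
                Writer out = new OutputStreamWriter(client.getOutputStream(), StandardCharsets.UTF_8);
                BufferedReader in = new BufferedReader(
                        new InputStreamReader(client.getInputStream(), StandardCharsets.UTF_8));
                out.write(line + "\n");
                out.flush();
                return in.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(echoOnce("hello over TCP"));
    }
}
```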

 

20.  ICMP, the Internet Control Message Protocol, uses raw IP datagrams to relay error messages between hosts. The best-known use of this protocol is in the ping program. Java does not support ICMP, nor does it allow the sending of raw IP datagrams (as opposed to TCP segments or UDP datagrams). The only protocols Java supports are TCP and UDP, and application layer protocols built on top of these. All other transport layer, internet layer, and lower layer protocols such as ICMP, IGMP, ARP, RARP, RSVP, and others can only be implemented in Java programs by linking to native code.

 

21.  IPv6 uses 16-byte addresses. IPv6 addresses are customarily written in eight blocks of four hexadecimal digits separated by colons. Leading zeros do not need to be written. A double colon, at most one of which may appear in any address, indicates multiple zero blocks. For example, FEDC:0000:0000:0000:00DC:0000:7076:0010 could be written more compactly as FEDC::DC:0:7076:10. In mixed networks of IPv6 and IPv4, the last four bytes of the IPv6 address are sometimes written as an IPv4 dotted quad address. For example, FEDC:BA98:7654:3210:FEDC:BA98:7654:3210 could be written as FEDC:BA98:7654:3210:FEDC:BA98:118.84.50.16.
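The equivalence of these notations can be checked with `java.net.InetAddress`, which parses address literals without a DNS lookup. This is a small sketch with invented class and method names; it only compares the raw 16 bytes and says nothing about how Java prints addresses.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Arrays;

// Compares different textual spellings of the same IPv6 address.
final class Ipv6NotationSketch {

    // Parses an IP literal into its raw bytes. Only pass valid literals here:
    // anything else would be treated as a hostname and looked up.
    static byte[] parse(String literal) {
        try {
            return InetAddress.getByName(literal).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid IP literal: " + literal, e);
        }
    }

    static boolean sameAddress(String first, String second) {
        return Arrays.equals(parse(first), parse(second));
    }

    public static void main(String[] args) {
        System.out.println(parse("FEDC::DC:0:7076:10").length + " bytes");
        System.out.println(sameAddress(
                "FEDC:0000:0000:0000:00DC:0000:7076:0010", "FEDC::DC:0:7076:10"));
        System.out.println(sameAddress(
                "FEDC:BA98:7654:3210:FEDC:BA98:7654:3210",
                "FEDC:BA98:7654:3210:FEDC:BA98:118.84.50.16"));
    }
}
```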

 

22.  It’s also possible, although less likely, for an IP address to change while the program is running (e.g., if a DHCP lease expires), so you may want to check the current IP address every time you need it rather than caching it.

 

23.  All IPv4 addresses that begin with 10., 172.16. through 172.31., and 192.168. are reserved for private networks rather than assigned to public hosts. They can be used on internal networks, but no host using addresses in these blocks is allowed onto the global Internet. IPv4 addresses beginning with 127 (most commonly 127.0.0.1) always mean the local loopback address. That is, these addresses always point to the local computer, no matter which computer you’re running on. The hostname for this address is often localhost. In IPv6, 0:0:0:0:0:0:0:1 (a.k.a. ::1) is the loopback address. The address 0.0.0.0 always refers to the originating host, but may only be used as a source address, not a destination. Similarly, any IPv4 address that begins with 0. (eight zero bits) is assumed to refer to a host on the same local network.
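`InetAddress` exposes tests for several of these special ranges. The sketch below uses `isLoopbackAddress()`, `isAnyLocalAddress()`, and `isSiteLocalAddress()`; the `classify` helper and its labels are invented for this note, and other special ranges (such as addresses beginning with 0.) simply fall into "other" here.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Labels an IP literal as loopback, wildcard (0.0.0.0), private, or other.
final class AddressClassSketch {

    static String classify(String literal) {
        InetAddress address;
        try {
            // Valid literals are parsed locally; no DNS lookup is made for them.
            address = InetAddress.getByName(literal);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not a valid IP literal: " + literal, e);
        }
        if (address.isLoopbackAddress()) {
            return "loopback";
        }
        if (address.isAnyLocalAddress()) {
            return "wildcard";
        }
        if (address.isSiteLocalAddress()) {
            return "private"; // for IPv4: 10.x.x.x, 172.16.x.x to 172.31.x.x, 192.168.x.x
        }
        return "other";
    }

    public static void main(String[] args) {
        String[] samples = {"10.1.2.3", "172.31.0.1", "172.32.0.1", "192.168.1.1",
                            "127.0.0.1", "::1", "0.0.0.0"};
        for (String sample : samples) {
            System.out.println(sample + " is " + classify(sample));
        }
    }
}
```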

 

24.  The IPv4 address that uses the same number for each of the four bytes (i.e., 255.255.255.255), is a broadcast address. Packets sent to this address are received by all nodes on the local network, though they are not routed beyond the local network. This is commonly used for discovery. For instance, when an ephemeral client such as a laptop boots up, it will send a particular message to 255.255.255.255 to find the local DHCP server. All nodes on the network receive the packet, but only the DHCP server responds. In particular, it sends the laptop information about the local network configuration, including the IP address that laptop should use for the remainder of its session and the address of a DNS server it can use to resolve hostnames.

 

25.  Each computer with an IP address has tens of thousands of logical ports (65,535 per transport layer protocol, to be precise). These are purely abstractions in the computer’s memory and do not represent anything physical. Each port is identified by a number between 1 and 65535. Each port can be allocated to a particular service.

 

26.  Port numbers between 1 and 1023 are reserved for well-known services like finger, FTP, HTTP, and IMAP. On Unix systems, including Linux and Mac OS X, only programs running as root can receive data from these ports, but all programs may send data to them. On Windows, any program may use these ports without special privileges.
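The port ranges above translate into simple checks. In this sketch the class, constants, and method names are made up for illustration; the only library behavior relied on is that `new ServerSocket(0)` asks the operating system for any free port.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.ServerSocket;

// Range checks for port numbers, plus a way to obtain a free port from the OS.
final class PortSketch {

    static final int MIN_PORT = 1;
    static final int MAX_WELL_KNOWN_PORT = 1023;
    static final int MAX_PORT = 65535;

    static boolean isValid(int port) {
        return port >= MIN_PORT && port <= MAX_PORT;
    }

    // Ports in this range are the ones reserved for well-known services.
    static boolean isWellKnown(int port) {
        return port >= MIN_PORT && port <= MAX_WELL_KNOWN_PORT;
    }

    // Binds briefly to port 0 and reports which port the OS actually assigned.
    static int anyFreePort() {
        try (ServerSocket socket = new ServerSocket(0)) {
            return socket.getLocalPort();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("80 well-known? " + isWellKnown(80));
        System.out.println("8080 well-known? " + isWellKnown(8080));
        System.out.println("OS-assigned port: " + anyFreePort());
    }
}
```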

 

27.  The following table shows the well-known ports for the protocols. On Unix systems, a fairly complete listing of assigned ports is stored in the file /etc/services.

 

| Protocol | Port | Transport | Purpose |
|---|---|---|---|
| echo | 7 | TCP/UDP | Echo is a test protocol used to verify that two machines are able to connect by having one echo back the other’s input. |
| discard | 9 | TCP/UDP | Discard is a less useful test protocol in which all data received by the server is ignored. |
| daytime | 13 | TCP/UDP | Provides an ASCII representation of the current time on the server. |
| FTP data | 20 | TCP | FTP uses two well-known ports. This port is used to transfer files. |
| FTP | 21 | TCP | This port is used to send FTP commands like put and get. |
| SSH | 22 | TCP | Used for encrypted, remote logins. |
| telnet | 23 | TCP | Used for interactive, remote command-line sessions. |
| SMTP | 25 | TCP | The Simple Mail Transfer Protocol is used to send email between machines. |
| time | 37 | TCP/UDP | A time server returns the number of seconds that have elapsed on the server since midnight, January 1, 1900, as a four-byte, unsigned, big-endian integer. |
| whois | 43 | TCP | A simple directory service for Internet network administrators. |
| finger | 79 | TCP | A service that returns information about a user or users on the local system. |
| HTTP | 80 | TCP | The underlying protocol of the World Wide Web. |
| POP3 | 110 | TCP | Post Office Protocol version 3 is a protocol for the transfer of accumulated email from the host to sporadically connected clients. |
| NNTP | 119 | TCP | Usenet news transfer; more formally known as the “Network News Transfer Protocol.” |
| IMAP | 143 | TCP | Internet Message Access Protocol is a protocol for accessing mailboxes stored on a server. |
| dict | 2628 | TCP | A UTF-8 encoded dictionary service that provides definitions of words. |
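The time protocol entry (port 37) describes a value that needs a little care in Java, which has no unsigned 32-bit integer type. The sketch below decodes such a four-byte value into a `java.time.Instant`. The class name is invented, and the code assumes the 1900 epoch is midnight UTC; it works on a byte array only and does not contact any server.

```java
import java.nio.ByteBuffer;
import java.time.Instant;

// Decodes the four-byte reply of a time server (seconds since 1900) into an Instant.
final class TimeProtocolSketch {

    // Seconds between 1900-01-01T00:00:00Z and the Unix epoch, 1970-01-01T00:00:00Z:
    // 70 years including 17 leap days = 25,567 days.
    static final long SECONDS_1900_TO_1970 = 2_208_988_800L;

    static Instant decode(byte[] fourBytes) {
        if (fourBytes.length != 4) {
            throw new IllegalArgumentException("Expected exactly 4 bytes");
        }
        // ByteBuffer reads big-endian by default; the mask turns the signed int
        // into the unsigned value the protocol intends.
        long secondsSince1900 = ByteBuffer.wrap(fourBytes).getInt() & 0xFFFFFFFFL;
        return Instant.ofEpochSecond(secondsSince1900 - SECONDS_1900_TO_1970);
    }

    public static void main(String[] args) {
        byte[] unixEpoch = {(byte) 0x83, (byte) 0xAA, 0x7E, (byte) 0x80}; // 2,208,988,800
        System.out.println(decode(unixEpoch));
        System.out.println(decode(new byte[4]));
    }
}
```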

 

28.  The Internet is the world’s largest IP-based network. It is an amorphous group of computers in many different countries on all seven continents (Antarctica included) that talk to one another using IP protocols. Intranet loosely describes corporate practices of putting lots of data on internal web servers that are not visible to users outside the local network.

 

29.  Each IP block has a fixed prefix. For instance, if the prefix is 216.254.85, then the local network can use addresses from 216.254.85.0 to 216.254.85.255. Because this block fixes the first 24 bits, it’s called a /24. A /23 specifies the first 23 bits, leaving 9 bits for 2^9 or 512 total local IP addresses.
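The arithmetic behind these block sizes is short enough to sketch. The helpers below are hypothetical and handle IPv4 dotted quads only: `blockSize` computes 2^(32 - prefix length), and `contains` compares the fixed prefix bits of two addresses.

```java
// IPv4 block arithmetic: how many addresses a prefix leaves, and whether an address is in a block.
final class CidrSketch {

    // A /n block fixes n bits and leaves 32 - n bits free.
    static long blockSize(int prefixLength) {
        checkPrefix(prefixLength);
        return 1L << (32 - prefixLength);
    }

    static boolean contains(String network, int prefixLength, String address) {
        int mask = mask(prefixLength);
        return (toInt(network) & mask) == (toInt(address) & mask);
    }

    // Builds a mask with the top prefixLength bits set.
    private static int mask(int prefixLength) {
        checkPrefix(prefixLength);
        return prefixLength == 0 ? 0 : -1 << (32 - prefixLength);
    }

    // Packs a dotted quad such as 216.254.85.0 into one 32-bit value.
    private static int toInt(String dottedQuad) {
        String[] parts = dottedQuad.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("Expected four octets: " + dottedQuad);
        }
        int value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("Octet out of range: " + part);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    private static void checkPrefix(int prefixLength) {
        if (prefixLength < 0 || prefixLength > 32) {
            throw new IllegalArgumentException("Prefix length must be between 0 and 32");
        }
    }

    public static void main(String[] args) {
        System.out.println("/24 -> " + blockSize(24) + " addresses");
        System.out.println("/23 -> " + blockSize(23) + " addresses");
        System.out.println(contains("216.254.85.0", 24, "216.254.85.255"));
        System.out.println(contains("216.254.85.0", 24, "216.254.86.0"));
    }
}
```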

 

30.  Because of the increasing scarcity of and demand for raw IP addresses, most networks today use Network Address Translation (NAT). In NAT-based networks most nodes only have local, non-routable addresses selected from either 10.x.x.x, 172.16.x.x to 172.31.x.x, or 192.168.x.x. The routers that connect the local networks to the ISP translate these local addresses to a much smaller set of routable addresses.

 

31.  The firewall is often part of the router that connects the local network to the broader Internet and may perform other tasks, such as network address translation. Then again, the firewall may be a separate machine.

 

32.  A proxy server has a detailed understanding of some application-level protocols, such as HTTP and FTP. (The notable exceptions are SOCKS proxy servers, which operate at the transport layer and can proxy for all TCP and UDP connections regardless of application layer protocol.) The following figure shows how proxy servers fit into the layer model:



 

 

33.  The biggest problem with proxy servers is their inability to cope with all but a few protocols. Generally, established protocols like HTTP, FTP, and SMTP are allowed to pass through, while newer protocols like BitTorrent are not. (Some network administrators consider this a feature.) In the rapidly changing world of the Internet, this is a significant disadvantage. It’s a particular disadvantage for Java programmers because it limits the effectiveness of custom protocols. In Java, it’s easy and often useful to create a new protocol that is optimized for your application. However, no proxy server will ever understand these one-of-a-kind protocols. Consequently, some developers have taken to tunneling their protocols through HTTP, most notably with SOAP.

 

34.  Applets that run in web browsers normally use the proxy server settings of the web browser itself, though these can be overridden in the Java Control Panel. Standalone Java applications can indicate the proxy server to use by setting the socksProxyHost and socksProxyPort properties (if you’re using a SOCKS proxy server), or http.proxySet, http.proxyHost, http.proxyPort, https.proxySet, https.proxyHost, https.proxyPort, ftpProxySet, ftpProxyHost, ftpProxyPort, gopherProxySet, gopherProxyHost, and gopherProxyPort system properties (if you’re using protocol-specific proxies). You can set system properties from the command line using the -D flag, like this: java -DsocksProxyHost=socks.cloud9.net -DsocksProxyPort=1080 MyClass
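The same SOCKS settings can also be applied from inside a program. This sketch shows two options, reusing the example host and port from the command line above: setting the system properties with `System.setProperty`, or building a `java.net.Proxy` object for individual connections. The class and method names are hypothetical, and no connection is opened.

```java
import java.net.InetSocketAddress;
import java.net.Proxy;

// Two ways to describe a SOCKS proxy from code rather than from -D flags.
final class ProxySketch {

    // JVM-wide: equivalent to -DsocksProxyHost=... -DsocksProxyPort=... on the command line.
    static void useSocksEverywhere(String host, int port) {
        System.setProperty("socksProxyHost", host);
        System.setProperty("socksProxyPort", Integer.toString(port));
    }

    // Per connection: a Proxy object that can be passed to new Socket(proxy)
    // or url.openConnection(proxy). createUnresolved avoids a DNS lookup here.
    static Proxy socksProxy(String host, int port) {
        return new Proxy(Proxy.Type.SOCKS, InetSocketAddress.createUnresolved(host, port));
    }

    public static void main(String[] args) {
        Proxy proxy = socksProxy("socks.cloud9.net", 1080);
        System.out.println(proxy.type() + " via " + proxy.address());
    }
}
```

The per-connection form leaves the rest of the JVM untouched, which may be preferable when only some traffic should go through the proxy.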

 

35.  Java does not have explicit peer-to-peer communication in its core networking API. However, applications can easily offer peer-to-peer communications in several ways, most commonly by acting as both a server and a client. Alternatively, the peers can communicate with each other through an intermediate server program that forwards data from one peer to the other peers. This neatly solves the discovery problem of how two peers find each other.

 

36.  Although there are many standards organizations in the world, the two that produce most of the standards relevant to application layer network programming and protocols are the Internet Engineering Task Force (IETF) and the World Wide Web Consortium (W3C). The IETF is a relatively informal, democratic body open to participation by any interested party. Its standards are based on “rough consensus and running code” and tend to follow rather than lead implementations. IETF standards include TCP/IP, MIME, and SMTP. The W3C, by contrast, is a vendor organization, controlled by dues-paying member corporations, that explicitly excludes participation by individuals. For the most part, the W3C tries to define standards in advance of implementation. W3C standards include HTTP, HTML, and XML.

 

37.  IETF standards and near-standards are published as Requests for Comments (RFCs). RFCs range from informational documents of general interest to detailed specifications of standard Internet protocols such as FTP. RFCs are available from many locations on the Internet, including http://www.faqs.org/rfc/ and http://www.ietf.org/rfc.html.

 

38.  Although many people participate in developing W3C standards, each standard is ultimately approved or vetoed by one individual, W3C director Tim Berners-Lee. The W3C has five basic levels of standards: Note, Working Draft, Candidate Recommendation, Proposed Recommendation, and Recommendation.

 
