
The intricacies of WUDO (Windows Update Delivery Optimization) – Chapter 4 – Network

In this chapter, we delve into the details of Delivery Optimization’s (DO) network communication, building on the insights from previous chapters. We begin with information from Microsoft and proceed to summarize our extensive black box and reversing research.

Introduction

In this chapter, we’ll discuss DO’s network communication, concluding the information gathered in the last three chapters. The first section outlines information provided by Microsoft, and the rest of the article summarizes our black box and reversing research.

As most of the reversing process was explained in the previous post, this chapter will be more concise, with the goal of documenting the internal implementations of Delivery Optimization.

TL;DR – Graphical flow

In the introduction chapter, we showed the following peek of the DO protocol. The diagram shows a communication record that we sniffed between two peers; the sniff can also be found in our repository.
Now it’s time to answer the questions you might have had about the flow and protocol specifications, so without further ado, let us begin.

Figure 1 – DO Protocol

Documentation

How does a machine download an update via Delivery Optimization?

Microsoft’s documentation explains how the process works. Basically, when a peer wants to download a file, the process can be split into two phases:

  1. Client-to-Server: Retrieve the file’s metadata from the server – this consists of a list of hashes for verification.
  2. Peer-to-Peer: Connect to peers and exchange the file’s pieces.

But how does it tick? For example, how does a machine discover peers to connect to?
In the following sections, we’ll make public the internal logic and structure of Delivery Optimization.

High-level overview

Client-to-Server

| Phase | Direction | Purpose |
|---|---|---|
| Content metadata | client → server | Request the file’s metadata. |
| Content metadata | server → client | Reply with the file’s metadata. If the file isn’t shared via DO, the server returns an error code. |
| Join request | client → server | Join the group of peers that are downloading and sharing the file. |
| Join request | server → client | Reply with a list of peers to connect to. The server also records this peer internally, so that it can refer future peers to it. |

Peer-to-Peer

The DO peer-to-peer protocol creates a decentralized network of peers that share Swarm files. Each Swarm is split into pieces, and when each piece is downloaded, the hash is validated against the hash received from a Microsoft server.
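As a rough illustration of that validation step, here is a minimal sketch of how a client could check a piece before accepting it. The hash algorithm is an assumption on my part: I use SHA-256 only because the ‘HashOfHashes’ is 32 bytes long, and the function name `verify_piece` is hypothetical.

```python
import hashlib
import hmac


def verify_piece(piece: bytes, expected_digest: bytes) -> bool:
    """Return True if a downloaded piece matches the digest from the metadata."""
    # ASSUMPTION: SHA-256 is a guess based on the 32-byte 'HashOfHashes';
    # this chapter does not name the algorithm DO actually uses.
    actual_digest = hashlib.sha256(piece).digest()
    # Constant-time comparison; on a mismatch the piece should be discarded.
    return hmac.compare_digest(actual_digest, expected_digest)


if __name__ == "__main__":
    piece = b"bytes received from an untrusted peer"
    # Stand-in for the digest that would come from the Microsoft server.
    trusted_digest = hashlib.sha256(piece).digest()
    print(verify_piece(piece, trusted_digest))
    print(verify_piece(piece + b"tampered", trusted_digest))
```

The point of the design is that the trusted value comes from the server, so a peer can only ever supply data that either matches it or is thrown away.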

When connecting to a peer to request a file, a ‘SwarmHash’ is passed in the initial handshake to identify the required update file.

If the peer has that file, or part of it, the peer replies with its own handshake; if it does not have part of the file, the peer terminates the TCP session. This means that in order to connect to a peer, you must know its IP, and which Swarms it has. Once a connection is established, each peer can request chunks of the file from the other peer.

The DO service listens on TCP port 7680 using IPv4 and IPv6; it can accept connections from other peers or initiate connections to them. Each TCP connection handles a single Swarm file, and peers exchange asynchronous messages over it. In the following sections, we’ll review some important messages; the full list appears later in this chapter.

Handshake message

The first message sent after the initial TCP handshake is the DO handshake message. This message contains the ‘SwarmHash’ that identifies the shared file, and the ‘PeerId’ of the sender. The file isn’t identified by its hash as you might assume (that would make it too easy for us researchers!). Instead, the ‘SwarmHash’ must be the ‘HashOfHashes’ that was received from the ‘cp*.prod.do.dsp.mp.microsoft.com’ server (more about this in the following Client-Server section), and is stored in the registry archive under the name ‘HashID’. Unfortunately, although the protocol is plaintext, one cannot easily determine from a sniff which file is being downloaded; only its ‘HashOfHashes’ is revealed.

Figure 2 – Handshake messages

BitField message

After the dual handshake, each side sends a BitField message. This message contains the details of which pieces of the file each side has, indicating to the peer what other pieces can be requested.

Figure 3 – BitField messages

Request and Piece messages

Request – An asynchronous request for the peer to send a specific piece.

Piece – A response to a request message, containing the piece. Usually sent over multiple TCP segments.

Figure 5 – Request and Piece messages

Peer-to-Peer – low level

Untrusted peer protocols

Before delving into the protocol, I’d like to raise a theoretical question: What would be the security issues for a file-sharing protocol? They’d probably be something like this:

  • Peers cannot be trusted to send the file, because they might send malformed or even malicious files.
  • Peers cannot be trusted at all, because they may attempt to leak data from a peer, or leverage the protocol as an attack surface for vulnerabilities.

With these assumptions in mind, let’s begin.

Protocol structure

Each message is a C-style structure, sent as plaintext without encryption or compression. As with other network protocols, multi-byte fields are big-endian, also known as network byte order.

The format of the handshake message differs from that of all other messages.

Handshake
| Name | Type | Description or default value |
|---|---|---|
| ProtocolNameLen | BYTE | Length of the ‘ProtocolName’ string. Always 14, but can be any number less than 0x32 and not 0x16. |
| ProtocolName | CHAR[ProtocolNameLen] | A string with the size of ‘ProtocolNameLen’. Always ‘Swarm protocol’, but the actual code doesn’t verify this. |
| Size | QWORD | Always 0x100000, which probably states the size of pieces. Never used by the code, so can be any value. |
| SwarmHash | BYTE[0x20] | File identifier, the ‘HashOfHashes’ of the file downloaded by this protocol. |
| PeerId | GUID + DWORD | Unique identifier of the client; the suffix DWORD is always 0. |
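To make the layout concrete, here is a minimal sketch that packs and unpacks a handshake according to the table above. The function names are hypothetical, and I assume that the 20-byte ‘PeerId’ goes on the wire exactly as the hex string from the join request decodes; the swarm hash in the example is a placeholder, not a real one.

```python
import struct

PROTOCOL_NAME = b"Swarm protocol"  # 14 bytes
DEFAULT_SIZE = 0x100000            # value always seen in the Size field
SWARM_HASH_LEN = 0x20
PEER_ID_LEN = 20                   # 16-byte GUID + 4-byte zero DWORD


def build_handshake(swarm_hash: bytes, peer_id: bytes) -> bytes:
    """Pack a handshake: name length, name, Size, SwarmHash, PeerId."""
    if len(swarm_hash) != SWARM_HASH_LEN:
        raise ValueError("SwarmHash must be 0x20 bytes")
    if len(peer_id) != PEER_ID_LEN:
        raise ValueError("PeerId must be 20 bytes")
    return (
        struct.pack(">B", len(PROTOCOL_NAME))
        + PROTOCOL_NAME
        + struct.pack(">Q", DEFAULT_SIZE)  # big-endian QWORD
        + swarm_hash
        + peer_id
    )


def parse_handshake(data: bytes) -> dict:
    """Unpack a handshake, applying the length rules from the table."""
    if not data:
        raise ValueError("empty handshake")
    name_len = data[0]
    if name_len >= 0x32 or name_len == 0x16:
        raise ValueError("unsupported protocol name length")
    expected = 1 + name_len + 8 + SWARM_HASH_LEN + PEER_ID_LEN
    if len(data) < expected:
        raise ValueError("truncated handshake")
    offset = 1
    name = data[offset:offset + name_len]
    offset += name_len
    (size,) = struct.unpack_from(">Q", data, offset)
    offset += 8
    swarm_hash = data[offset:offset + SWARM_HASH_LEN]
    offset += SWARM_HASH_LEN
    peer_id = data[offset:offset + PEER_ID_LEN]
    return {"name": name, "size": size, "swarm_hash": swarm_hash, "peer_id": peer_id}


if __name__ == "__main__":
    peer_id = bytes.fromhex("29f4acd3bf96fe41bdd161e08f0ef28800000000")
    message = build_handshake(bytes(range(32)), peer_id)  # placeholder hash
    print(len(message), parse_handshake(message)["name"])
```

With the default protocol name, the whole handshake is 1 + 14 + 8 + 32 + 20 = 75 bytes.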

Other messages always start with a DWORD that contains the size of the message (not including the DWORD size). The most basic is the KeepAlive message.

KeepAlive: Keep connection alive.

KeepAlive  
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | Size of the message that follows; must be 0 for KeepAlive |

All other messages contain an extra BYTE that indicates the message type. The messages below are listed in order of ‘MessageId’, not in logical order.
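Because every non-handshake message is length-prefixed, a receiver can split a TCP byte stream into messages without knowing every message type. The following is a minimal sketch of that framing logic; `split_messages` is a hypothetical helper of mine, not a name from ‘DoSvc.dll’.

```python
import struct


def split_messages(buffer: bytes):
    """Split a byte stream into (message_id, payload) tuples.

    Returns the parsed messages and any trailing bytes that do not yet
    form a complete message. A KeepAlive (MessageSize 0) has no MessageId
    and is reported as (None, b"").
    """
    messages = []
    offset = 0
    while len(buffer) - offset >= 4:
        (size,) = struct.unpack_from(">I", buffer, offset)  # big-endian DWORD
        if len(buffer) - offset - 4 < size:
            break  # the rest of this message has not arrived yet
        body = buffer[offset + 4:offset + 4 + size]
        offset += 4 + size
        if size == 0:
            messages.append((None, b""))
        else:
            messages.append((body[0], body[1:]))
    return messages, buffer[offset:]


if __name__ == "__main__":
    stream = (
        bytes.fromhex("00000000")                # KeepAlive
        + bytes.fromhex("0000000101")            # MessageSize 1, MessageId 1
        + bytes.fromhex("000000050400000007")    # MessageSize 5, MessageId 4
        + bytes.fromhex("0000")                  # incomplete length prefix
    )
    print(split_messages(stream))
```

Returning the unparsed tail lets the caller prepend it to the next chunk read from the socket, which matters because large messages span multiple TCP segments.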

Choke: Stop all data transactions. This is the default state, because DO is bandwidth-aware.

Choke  
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | 1 |
| MessageId | BYTE | 0 |

Unchoke: Allow data transactions.

Unchoke  
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | 1 |
| MessageId | BYTE | 1 |

Interested: Client is interested in downloading data from peer.

Interested  
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | 1 |
| MessageId | BYTE | 2 |

NotInterested: Client is no longer interested in downloading data from peer, but doesn’t wish to terminate the session.

NotInterested  
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | 1 |
| MessageId | BYTE | 3 |

Have: Client notifies peer that it downloaded a piece from another source.

Have  
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | 5 |
| MessageId | BYTE | 4 |
| PieceIndex | DWORD | Piece number, index in BitField |

BitField: Send list of pieces client has or doesn’t have.

BitField  
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | 1 + sizeof(‘PiecesBitField’) |
| MessageId | BYTE | 5 |
| PiecesBitField | BIT[number of pieces] | Array of bits; each bit is a boolean indicating whether the sender has the piece at that index |
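As a small illustration, this sketch turns a ‘PiecesBitField’ payload into a list of piece indexes. The bit order is an assumption: I treat the most significant bit of the first byte as piece 0, which this chapter does not confirm. The helper name is hypothetical.

```python
from typing import List


def pieces_from_bitfield(bitfield: bytes, piece_count: int) -> List[int]:
    """Return the indexes of the pieces that the sender claims to have."""
    if piece_count > len(bitfield) * 8:
        raise ValueError("bitfield is too short for the number of pieces")
    have = []
    for index in range(piece_count):
        # ASSUMPTION: most significant bit first (piece 0 is mask 0x80).
        mask = 0x80 >> (index % 8)
        if bitfield[index // 8] & mask:
            have.append(index)
    return have


if __name__ == "__main__":
    # Ten pieces packed into two bytes; the trailing bits are padding.
    print(pieces_from_bitfield(bytes([0b10100000, 0b01000000]), 10))
```

Comparing this list with the local one tells a client which pieces are worth requesting from that peer.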

Request: Request piece from peer.

Request  
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | 13 |
| MessageId | BYTE | 6 |
| PieceIndex | DWORD | Piece number, index in BitField |
| PieceStartOffset | DWORD | Offset inside the piece that the client requests |
| PieceSize | DWORD | Size of data from the offset, usually 0x100000 (1 MB) and limited to 2 MB |

Piece: Send piece to peer.

Piece  
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | 9 + sizeof(PieceBuffer) |
| MessageId | BYTE | 7 |
| PieceIndex | DWORD | Piece number, index in BitField |
| PieceStartOffset | DWORD | Offset inside the piece |
| PieceBuffer | BYTE[MessageSize-9] | Piece buffer |
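The Request and Piece layouts map directly onto fixed-size structures. Here is a minimal sketch, with hypothetical helper names, that builds a Request and parses a complete Piece message according to the two tables above.

```python
import struct

MSG_REQUEST = 6
MSG_PIECE = 7
PIECE_HEADER_LEN = 13  # MessageSize (4) + MessageId (1) + index (4) + offset (4)


def build_request(piece_index: int, start_offset: int, size: int = 0x100000) -> bytes:
    """Pack a Request: MessageSize 13, MessageId 6, then three DWORDs."""
    return struct.pack(">IBIII", 13, MSG_REQUEST, piece_index, start_offset, size)


def parse_piece(message: bytes):
    """Unpack one complete Piece message into (index, offset, buffer)."""
    if len(message) < PIECE_HEADER_LEN:
        raise ValueError("truncated Piece message")
    size, message_id, piece_index, start_offset = struct.unpack_from(">IBII", message, 0)
    if message_id != MSG_PIECE:
        raise ValueError("not a Piece message")
    if size < 9 or len(message) != 4 + size:
        raise ValueError("MessageSize does not match the data received")
    return piece_index, start_offset, message[PIECE_HEADER_LEN:]


if __name__ == "__main__":
    print(build_request(3, 0).hex())
    reply = struct.pack(">IBII", 9 + 4, MSG_PIECE, 3, 0) + b"data"
    print(parse_piece(reply))
```

Note that `parse_piece` expects the whole message; since a Piece usually arrives over several TCP segments, the caller has to reassemble it first.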

Cancel: Cancel requested piece from peer. This can be in the middle of a transfer, as the transfer is split into multiple TCP segments.

Cancel  
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | 13 |
| MessageId | BYTE | 8 |
| PieceIndex | DWORD | Piece number, index in BitField |
| PieceStartOffset | DWORD | Offset inside the piece |
| PieceSize | DWORD | Size of piece |

HeapSpraying (not the official name): Empty command of unlimited size; nothing is done with the received buffer.

HeapSpraying (not the official name) 
| Name | Type | Description or default value |
|---|---|---|
| MessageSize | DWORD | 1 + sizeof(Buffer) |
| MessageId | BYTE | 8 |
| Buffer | BYTE[MessageSize-1] | Buffer of any length |

Writing a Wireshark dissector

Unfortunately, dissectors are only supported in Lua or C; I chose to write mine in Lua. As I had only heard nightmare-like rumors about Lua, I expected the worst, but thanks to Mika Sundland’s great guide, the process was surprisingly easy. The dissector that I created can be found in our GitHub repository.

The graphical flows shown above contain Wireshark snippets created by this dissector.

Figure 6 – Dissector example

Later, we wanted to integrate DO parsing directly into Wireshark, which required developing a native C dissector.
The dissector source code can be found in my Wireshark fork. It was merged into Wireshark and will hopefully be included in the next version.
A compiled DLL plugin can also be found in the GitHub repo.

Figure 7 – C Dissector

My two cents on the protocol

Well done Microsoft, well done (slow claps).

Security by Rudimentary

The protocol is quite bare-bones, with no encryption, compression, or third-party libraries. It’s my guess that this design was chosen to limit the new and unique attack surface that this protocol provides to attackers.

In most cases, encryption and compression are crucial for security and speed; however, in this case, I cannot emphasize enough the security benefits of such an approach.

This protocol allows LAN connections over IPv4, and WAN connections over IPv6 and Teredo, creating a unique attack surface. To reduce exploitation options, the protocol limits that attack surface by allowing only old-school, structure-based communication.

Had the protocol been encrypted or compressed, it could have opened up a whole new world of vulnerabilities. SMB v3.1.1, for example, was susceptible to the SMBGhost vulnerability because of its compression layer.

But wait – isn’t encryption good for security? Isn’t compression crucial for speed and bandwidth?

It turns out that in this specific case, the answer to both those questions is ‘no’:

  • Given that attackers can join the decentralized network, the benefits of encryption – such as preventing MITM – are diminished. Instead of encryption, files are hashed by Microsoft to prevent malicious interference.
  • As most payloads will be compressed files, there usually isn’t a need for another compression layer.

Control

Microsoft’s tight hold on DO makes hijacking the protocol a complicated task, because all communications start with some form of communication to Microsoft services. In other networks, such as torrents, anyone can add their malicious files; in DO, only Microsoft can do that.

Network security

As we discovered in the previous chapter, the usage of size fields is minimized as much as possible, to prevent malicious peers from sending incorrectly-sized messages. When size and offset are provided, the message is verified and handled using secure functions.

Privacy

There’s no way of getting around the fact that DO leaks update information. However, it’s not easy to extract information in this way. You can only connect to a peer and pass the initial handshake if you have received the correct ‘SwarmHash’ from Microsoft’s server; you can’t just connect and read which updates the remote machine has and doesn’t have. In Chapter 6, we’ll discuss and demonstrate how to obtain update information remotely, despite the difficulties.

Client-to-Server – low level

Before accessing other peers, the process begins by querying Microsoft servers.

Content metadata

The client begins with the file’s SHA1 and download URL. It receives these from the component that requested the download. Initially, the client checks whether the file is shared in the DO network, by requesting its metadata from ‘cp*.prod.do.dsp.mp.microsoft.com’.

Example GET request:

GET https://cp801.prod.do.dsp.mp.microsoft.com/v3/content?Id=cb94ac42591f1a80852587f959357bd6b5b9eb7e&doClientVersion=10.0.19041.1266&altCatalogId=http%3A%2F%2Fau.download.windowsupdate.com%2Fc%2Fmsdownload%2Fupdate%2Fsoftware%2Fdefu%2F2021%2F11%2Fam_delta_cb94ac42591f1a80852587f959357bd6b5b9eb7e.exe&countryCode=IL&profile=1114112&CacheId=1 HTTP/1.1
Connection: Keep-Alive
Accept: */*
Accept-Encoding: gzip, deflate
User-Agent: Microsoft-Delivery-Optimization/10.0
MS-CV: maBT6stApEqncwu2.1.1.2.2.2.1.1.1
Content-Length: 0
Host: cp801.prod.do.dsp.mp.microsoft.com

Important headers

User-Agent – The user agent is specific to DO.

MS-CV – This is used for event tracing. When replicating the GET request myself, I discovered that I could just copy-paste the same string each time. This will help Microsoft to monitor anyone using my scripts, but as a Security Researcher, that isn’t a bad thing – as long as it isn’t blocked.

For more information on MS-CV, see:

https://github.com/microsoft/CorrelationVector

Important parameters

Id – The SHA1 hash of the file; this is the identifier of the file.

altCatalogId – The URL from which to download the file. Sometimes the ‘Id’ is enough, and this parameter can be ignored, but sometimes this parameter is required for the server to reply. In this example, the URL contains the name of the file, which indicates that ‘am_delta_cb94ac42591f1a80852587f959357bd6b5b9eb7e.exe’ is requested; this is a ‘Windows Security’ update. Some URLs weren’t specific, but downloading them in Chrome reveals their real name. I was surprised that Chrome downloads worked for this purpose, as Microsoft could block any request that is not accompanied by the correct user agent or MS-CV. Perhaps Microsoft allows such downloads because it wouldn’t prevent our access altogether, and would just make our lives harder.

profile – A unique profile number, which like ‘MS-CV’ can just be copy-pasted.
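For anyone who wants to replicate the request, here is a minimal sketch that rebuilds the example URL from its parameters. It only constructs the URL and headers and does not send anything. The function name is hypothetical, and the default values are simply the ones copied from the capture above.

```python
from urllib.parse import urlencode

# Header values copied from the captured request above.
HEADERS = {
    "User-Agent": "Microsoft-Delivery-Optimization/10.0",
    "MS-CV": "maBT6stApEqncwu2.1.1.2.2.2.1.1.1",
}


def build_content_url(host: str, file_sha1: str, alt_catalog_url: str,
                      client_version: str = "10.0.19041.1266",
                      country_code: str = "IL",
                      profile: str = "1114112") -> str:
    """Build the content metadata URL for a file identified by its SHA1."""
    query = urlencode({
        "Id": file_sha1,
        "doClientVersion": client_version,
        "altCatalogId": alt_catalog_url,  # percent-encoded by urlencode
        "countryCode": country_code,
        "profile": profile,
        "CacheId": "1",
    })
    return f"https://{host}/v3/content?{query}"


if __name__ == "__main__":
    sha1 = "cb94ac42591f1a80852587f959357bd6b5b9eb7e"
    url = build_content_url(
        "cp801.prod.do.dsp.mp.microsoft.com",
        sha1,
        "http://au.download.windowsupdate.com/c/msdownload/update/software/"
        f"defu/2021/11/am_delta_{sha1}.exe",
    )
    print(url)
    print(HEADERS)
```

The resulting URL and `HEADERS` can then be passed to any HTTP client.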

The server responds with a JSON document, for example:

HTTP/1.1 200 OK
Content-Type: text/json
Server: Microsoft-IIS/10.0
X-AspNet-Version: 4.0.30319
X-Powered-By: ASP.NET
Vary: Accept-Encoding
Content-Encoding: gzip
Content-Length: 396
Cache-Control: max-age=18000
Date: Mon, 29 Nov 2021 07:50:02 GMT
Connection: keep-alive
{
  "ContentId": "DO-hc4_fekzaz4jml79P2Db-ziNS36y8iKMOrkgaUEMGH4=",
  "HashOfHashes": "hc4+fekzaz4jml79P2Db/ziNS36y8iKMOrkgaUEMGH4=",
  "PiecesHashFileCdnUrls": [
    "http://emdl.ws.microsoft.com/emdl/c/doc/ph/prod5/msdownload/update/software/defu/2021/11/1024/am_delta_cb94ac42591f1a80852587f959357bd6b5b9eb7e.exe.json"
  ],
  "ContentCdnUrls": null,
  "IsSecure": "False",
  "IsInternal": "False",
  "Policies": {
    "ForegroundQosBps": "6710886",
    "BackgroundQosBps": "2621440",
    "MaxCacheAgeSecs": "259200",
    "ExpireAtSecsSinceEpoch": "",
    "DownloadToExpire": "86400",
    "ContentDownloadMode": "0"
  },
  "Rank": 0.0,
  "InfoCode": 0
}

Important results

PiecesHashFileCdnUrls – A URL to download the metadata file. These are the files found in the cache folder described in Chapter 2! It’s nice to see everything connecting together. This downloads a file containing the hash of each piece of the file. When retrieved over the peer-to-peer protocol, each piece will be verified. Note that the request uses HTTP, and is susceptible to a MITM attack that could provide different hashes.
HashOfHashes – The main result, the Id of the ‘Swarm’, the ‘SwarmHash’. This will be used later in the peer-to-peer protocol to request the file.
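The ‘HashOfHashes’ in the example decodes from Base64 to exactly 0x20 bytes, which matches the size of the ‘SwarmHash’ field in the handshake. The sketch below shows that conversion, under the assumption that the raw decoded bytes are what goes on the wire. It also reproduces the ‘ContentId’ from the same example; that mapping is based on this single sample only, and note that it swaps the characters the opposite way from the standard URL-safe Base64 alphabet. Both function names are hypothetical.

```python
import base64


def swarm_hash_from_metadata(hash_of_hashes_b64: str) -> bytes:
    """Decode the Base64 'HashOfHashes' into the 32-byte SwarmHash."""
    raw = base64.b64decode(hash_of_hashes_b64, validate=True)
    if len(raw) != 0x20:
        raise ValueError("HashOfHashes should decode to 0x20 bytes")
    return raw


def content_id_from_hash(hash_of_hashes_b64: str) -> str:
    """Reproduce the 'ContentId' seen in the example response."""
    # OBSERVATION from one sample only: '+' becomes '_' and '/' becomes '-'.
    return "DO-" + hash_of_hashes_b64.replace("+", "_").replace("/", "-")


if __name__ == "__main__":
    sample = "hc4+fekzaz4jml79P2Db/ziNS36y8iKMOrkgaUEMGH4="
    print(swarm_hash_from_metadata(sample).hex())
    print(content_id_from_hash(sample))
```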

Join request

The join request is used to enter the group of peers that are currently downloading and sharing a single file. The machine’s ‘contact info’ is also sent, so that future peers can connect to it.

Example POST request:

POST https://array507.prod.do.dsp.mp.microsoft.com/join/ HTTP/2.0
accept: */*
user-agent: Microsoft-Delivery-Optimization/10.0
ms-cv: jbCdmrZDLk2zie6O.1.1.6.2.6.1.1
content-length: 694
{
  "ContentId": "c67ff11995836a2718e033a3aca0faed72199626",
  "AltCatalogId": "http://au.download.windowsupdate.com/c/msdownload/update/software/updt/2021/10/windows10.0-kb5007289-x64-ndp48_c67ff11995836a2718e033a3aca0faed72199626.cab",
  "PeerId": "29f4acd3bf96fe41bdd161e08f0ef28800000000",
  "ReportedIp": "192.168.80.128",
  "SubnetMask": "255.255.255.0",
  "Ipv6": "2001:0:removed",
  "IsBackground": "1",
  "ClientCompactVersion": "10.0.19041.746",
  "Uploaded": "0",
  "Downloaded": "0",
  "DownloadedCdn": "0",
  "DownloadedDoinc": "0",
  "Left": "68429383",
  "JoinRequestEvent": "1",
  "RestrictedUpload": "0",
  "PeersWanted": "50",
  "GroupId": "",
  "Scope": "1",
  "UploadedBPS": "0",
  "DownloadedBPS": "0",
  "Profile": "1114112",
  "Seq": "0"
}

Important parameters:

ContentId – The SHA1 hash of the file.

AltCatalogId – As in the content metadata request, this is an optional URL of the file; in practice, both parameters are sent. In this example, we can learn that this is the ‘windows10.0-kb5007289’ update.

PeerId – The unique Id of the Swarm peer. As seen in the peer-to-peer protocol, this is a hex-encoded GUID, padded by a zero DWORD.

ReportedIp – The internal IP of the client. This is interesting, as internal addresses are disclosed here to Microsoft. I hope they keep this information to themselves.

Ipv6 – The client’s IPv6 address. Note that it’s a Teredo address. Teredo tunneling carries IPv6 over IPv4. This is significant, as it bypasses the NAT, allowing external hosts to connect to a local port.

Figure 8 – Teredo in a nutshell
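A Teredo address embeds the Teredo server’s IPv4 address together with an obfuscated copy of the client’s external IPv4 address and UDP port, as defined in RFC 4380. The sketch below decodes these using Python’s standard ‘ipaddress’ module. The sample is the well-known documentation-style address, not one taken from our captures, and `decode_teredo` is a hypothetical helper.

```python
import ipaddress


def decode_teredo(address: str):
    """Return the endpoints embedded in a Teredo address, or None."""
    addr = ipaddress.IPv6Address(address)
    pair = addr.teredo  # (server, client) for 2001:0000::/32, else None
    if pair is None:
        return None
    server, client = pair
    # The mapped UDP port sits in bits 32..47 and is XORed with 0xFFFF.
    port = ((int(addr) >> 32) & 0xFFFF) ^ 0xFFFF
    return {"server": str(server), "client": str(client), "port": port}


if __name__ == "__main__":
    print(decode_teredo("2001:0000:4136:e378:8000:63bf:3fff:fdd2"))
    print(decode_teredo("2001:db8::1"))
```

In other words, anyone who sees a peer’s Teredo address also learns the external IPv4 address and port of the NAT mapping behind it.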

The response is also JSON, containing a list of peers to connect to.

HTTP/2.0 200
cache-control: private
content-type: text/html
server: Microsoft-IIS/10.0
x-content-type-options: nosniff
x-aspnet-version: 4.0.30319
x-powered-by: ASP.NET
date: Thu, 25 Nov 2021 16:31:39 GMT
content-length: 2918
{
  "FailureReason": null,
  "NextJoinTimeIntervalInMs": 60000,
  "Complete": 0,
  "Incomplete": 0,
  "Rediscover": false,
  "KVVersion": "FC56C82EA8C7B0E78411A6E372D1BCDA9D99503936434A231CC0300EE87AAD7A",
  "GeoVersion": "FC01AB38C8785F45CA7CEF644C8A6BFC607C97DF07EDFFC753F84A7C48B00EDE",
  "Peers": [
    {
      "PeerId": "050e6fb5203d644f909d1e8d30a4c3dd00000000",
      "Type": 180,
      "Ip": "removed.removed.removed.removed",
      "Port": 7680,
      "Ipv6": "2001:0000:removed",
      "InternalIp": "",
      "ExternalIp": "removed",
      "Ipv6Port": 7680,
      "InternalPort": 0,
      "ExternalPort": 7680
    },
    ………….
  ],
  "Leave": false
}

Important results

The result contains a list of potential peers to connect to. When used behind a large NAT, this can be a very useful mapping of clients in the network, without performing any active scans. For internet IPs, access will usually be blocked to hosts behind a NAT – and indeed, most of my sniffs contained multiple failed attempts to connect to external IPs. However, as stated above, Teredo is used to bypass NATs.

PeerId – The Id of each remote peer, required for the peer-to-peer handshake message. Note that this Microsoft join server is the only source for this Id of remote peers. This peer ID information is required to connect to a remote PC.

Ip – The remote external IP.

Ipv6 – The remote IPv6. Note again: this is a Teredo address.

ExternalIp – Same as the Ip field, although they might sometimes differ.
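To show how these fields fit together, here is a minimal sketch that turns a join response into a list of candidate endpoints. Pairing each address field with the port field of a similar name is my assumption based on the field names alone. The addresses in the sample are documentation placeholders, not real peers, and `candidate_endpoints` is a hypothetical helper.

```python
import json

# ASSUMPTION: each address field goes with the similarly named port field.
ADDRESS_PORT_FIELDS = (
    ("Ip", "Port"),
    ("Ipv6", "Ipv6Port"),
    ("InternalIp", "InternalPort"),
    ("ExternalIp", "ExternalPort"),
)

SAMPLE_RESPONSE = """
{"FailureReason": null,
 "Peers": [{"PeerId": "050e6fb5203d644f909d1e8d30a4c3dd00000000",
            "Ip": "192.0.2.10", "Port": 7680,
            "Ipv6": "2001:0:4136:e378:8000:63bf:3fff:fdd2", "Ipv6Port": 7680,
            "InternalIp": "", "InternalPort": 0,
            "ExternalIp": "192.0.2.10", "ExternalPort": 7680}]}
"""


def candidate_endpoints(join_response_text: str):
    """Return unique (peer_id, host, port) tuples from a join response."""
    response = json.loads(join_response_text)
    if response.get("FailureReason"):
        raise ValueError("join failed: %s" % response["FailureReason"])
    endpoints = []
    for peer in response.get("Peers") or []:
        peer_id = bytes.fromhex(peer["PeerId"])  # GUID + zero DWORD, 20 bytes
        for host_key, port_key in ADDRESS_PORT_FIELDS:
            host, port = peer.get(host_key), peer.get(port_key)
            # Skip empty addresses and zero ports, and drop duplicates.
            if host and port and (peer_id, host, port) not in endpoints:
                endpoints.append((peer_id, host, port))
    return endpoints


if __name__ == "__main__":
    for peer_id, host, port in candidate_endpoints(SAMPLE_RESPONSE):
        print(peer_id.hex(), host, port)
```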

Direct downloads

We might forget about this standard practice in the context of DO, but updates are traditionally downloaded directly. If no peer is found, Windows reverts to downloading from Microsoft servers via HTTP and HTTPS downloads. Yes, you read that correctly: HTTP. This seems insecure, as downloads could be intercepted and replaced with malicious files. To resolve this, updates are signed by Microsoft, and verified before installation, so that any patched updates will be discarded. Checking whether updates are indeed signed and that the signature is verified is not in the scope of this article, but it’s an interesting thread to explore.

Why would Microsoft choose insecure HTTP over HTTPS? Money, of course. These file servers handle every Windows PC in the world, and using HTTPS adds cryptographic steps that cost slightly more in CPU usage – and in the grand scheme of things, this is a significant expense.

Ethereum Swarm

In Chapter 2, we suspected that Windows was using the Ethereum Swarm protocol, or some fork of it. I found no evidence of this protocol in the code written by Microsoft in ‘DoSvc.dll’ – all the code seemed native to Windows. Neither could I find evidence of code from ‘libp2p’ or ‘devp2p’ (Ethereum Swarm underlay protocols). The whole Microsoft Swarm protocol seems to be unique to Microsoft. However, there are uncanny similarities in the terminology used both by DO in the documentation, and in Ethereum Swarm. The term ‘Swarm’ is used in both ‘dialects’ to represent a shared file, and even specific concepts such as ‘SwarmHash’ are shared.

In the previous reversing chapter, we found evidence of an older, unsupported protocol whose name has a length of 0x16 – as you might remember, a handshake cannot begin with a string of that size. Perhaps Windows used to depend on the Ethereum Swarm, and it was replaced by 100% Microsoft code.

Summary

In this chapter, we discussed how DO requests information from Microsoft servers, and then connects to other remote peers. I have to be honest, Microsoft did a great job implementing this. Although there is information leakage, which allows us to know that remote peers are downloading updates and might not yet be updated, the leakage is limited to the bare-bones requirements. Nonetheless, in Chapter 6, we’ll discuss how we can still abuse the protocol.

In the next article, we’ll fuzz the protocol in search of CVEs (evil laugh).  
