Last time we saw that the first 6 bytes are always the same, but what do they mean?
00 00 01 7c 0a 10
I went further today with trying to understand secio. When we do the initial secio handshake, what we get back is 4 bytes indicating the length of the data, followed by protobuf-serialized data of the Propose type, which is defined here: https://github.com/libp2p/go-libp2p-secio/blob/master/pb/spipe.proto
So the first 4 bytes (in hex),
00 00 01 7c
actually mean 380 which is the length of the data after the first 4 bytes (we receive a total of 384 bytes).
How do we get 380 out of those 4 bytes?
We treat the 4 bytes as a single binary number (big-endian uint32).
00 = 00000000 = 0
00 = 00000000 = 0
01 = 00000001 = 1
7c = 01111100 = 124
(01 7c) = (00000001 01111100) = 380
Or, put another (more math/code) way:
len = [byte1] * 256^3 + [byte2] * 256^2 + [byte3] * 256^1 + [byte4] * 256^0
len = ([byte1] << 3*8) + ([byte2] << 2*8) + ([byte3] << 8) + [byte4]
In our case
len = 0 * 256^3 + 0 * 256^2 + 1 * 256^1 + 124 * 256^0
len = 1 * 256 + 124 * 1 = 256 + 124 = 380
So that gives us the length of data.
We read that much (380 bytes) of data and what we get is Propose encoded in protobuf.
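The length calculation above can be sketched in a few lines of Python. The helper name read_length_prefix is made up for this illustration and is not part of any libp2p library; the standard-library calls at the end are only there as a cross-check.

```python
import struct

def read_length_prefix(buf):
    """Decode the first 4 bytes of buf as a big-endian uint32 (hypothetical helper)."""
    b1, b2, b3, b4 = buf[:4]
    # Parentheses matter: in Python (and C), + binds tighter than <<.
    return (b1 << 24) + (b2 << 16) + (b3 << 8) + b4

prefix = bytes.fromhex("0000017c")
length = read_length_prefix(prefix)
print(length)  # 380

# The standard library should agree with the manual calculation.
assert length == int.from_bytes(prefix, "big")
assert length == struct.unpack(">I", prefix)[0]
```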
We’ll focus on the next two bytes:
0a 10
This is already the beginning of Propose encoded in protobuf.
It has 5 fields, numbered or ID-ed from 1 to 5 (field_number). All of them are length-delimited types (strings, bytes), which are marked in protobuf (wire_type) as 2.
In protobuf each length-delimited field starts with a header byte, followed by a byte (or more bytes, as it’s a varint) indicating the length of the data.
The header is defined similarly to mplex. The rightmost 3 bits are the type (wire_type) and the left part (first 5 bits) is the ID (field_number) of the field.
So like in mplex, if we draw it graphically, where ID is D and type is T:
D D D D D T T T
For example type = 2 (string) and ID = 1 would be:
0 0 0 0 1 0 1 0
That is 00001 for ID and 010 for type.
Code-wise we do it exactly the same as in mplex. If we name the header byte h:
type = h & 0x07
id = h >> 3
In hex, 00001010 is 0a, which is also what we’ve received.
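Here is a minimal sketch of the header split, plus the reverse direction. The function names split_header and make_header are invented for this example, and it assumes the header fits in one byte (true for field IDs up to 15).

```python
def split_header(h):
    """Split a one-byte protobuf header into (field_number, wire_type)."""
    return h >> 3, h & 0x07

def make_header(field_number, wire_type):
    """Build a one-byte header; assumes field_number <= 15."""
    return (field_number << 3) | wire_type

print(split_header(0x0a))      # (1, 2)
print(hex(make_header(1, 2)))  # 0xa
```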
The next byte is the length of this data.
Hex 10 stands for 16, and we did notice last time that after the initial “static” bytes we get 16 bytes of data, which are always different. In the Propose definition it’s defined as rand, 16 bytes, which in crypto lingo is usually called a nonce.
After we read those 16 bytes of nonce, we’ve finished with the first field, so the next byte, 12, is already the header of the next field. I’ll write this header in binary and put a space between the ID and type parts.
0x12 = 00010 010
We can see again that it’s of string type, i.e. 010 = 2, but this time its ID is 2 and not 1. In the definition of Propose it’s the pubkey or public key.
This time the length takes two bytes, not just one:
ab 02
How do we know it’s two bytes and not one? If we write the two bytes in binary:
0xab = 1 0 1 0 1 0 1 1
0x02 = 0 0 0 0 0 0 1 0
If the leftmost bit is 1, it means that the next byte should be taken into account as well. This means that if the length of the data is above 127, then 2 bytes are needed. (Probably it’s the same in mplex and the libp2p protocol, and not 254 as I wrote before, but I haven’t yet delved into it.)
So how do we make sense out of this now?
We can’t just join the numbers as before. It’s a bit more complex.
First we remove the leftmost bit in 0xab.
0xab = [1] 0 1 0 1 0 1 1 => 0x2b = 0 1 0 1 0 1 1
Then we put the next byte, 0x02, in front:
(0x02 0x2b) = 00000010 0101011 = 299
We can do it the code/math way:
length = [byte2] * 128 + ([byte1] & ~128)
length = ([byte2] << 7) + ([byte1] & ~128)
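The two-byte case generalizes to any number of bytes: keep taking the low 7 bits of each byte, shifting 7 more positions each time, until a byte arrives with its leftmost bit clear. Below is a small sketch of that loop; decode_varint is a name chosen for this example, not a library function.

```python
def decode_varint(buf, pos=0):
    """Decode a protobuf-style varint starting at buf[pos].

    Returns (value, position of the first byte after the varint).
    """
    value = 0
    shift = 0
    while True:
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift  # low 7 bits carry the payload
        shift += 7
        if not (b & 0x80):            # top bit clear: this was the last byte
            return value, pos

print(decode_varint(bytes([0xab, 0x02])))  # (299, 2)
print(decode_varint(bytes([0x10])))        # (16, 1)
```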
So the next 299 bytes are the public key. I don’t yet understand its format, as the key itself is 256 bytes long and the other 43 bytes are unknown to me.
Next are 3 fields, each with a header byte and a length byte, but inside they are comma-separated values, which are also ordered by preference.
From now on, I’ll leave out the hex/binary form and just write what we get.
First is exchanges.
We get the ID = 3, type = 2, length = 17. The data is:
P-256,P-384,P-521
After that we have the field ciphers.
We get the ID = 4, type = 2, length = 24. The data is:
AES-256,AES-128,Blowfish
Lastly we have the field hashes.
We get the ID = 5, type = 2, length = 13. The data is:
SHA256,SHA512
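To tie everything together, here is a rough sketch that walks all five fields of a Propose-shaped message. It is not the go-libp2p-secio code: the helper names are invented, it only handles wire type 2, and the message is synthetic (zero bytes stand in for the nonce and the public key, since I’m not reproducing the real ones here). The lengths are the ones observed above.

```python
def encode_varint(n):
    """Encode a non-negative integer as a protobuf-style varint."""
    out = bytearray()
    while True:
        b = n & 0x7F
        n >>= 7
        if n:
            out.append(b | 0x80)  # more bytes follow
        else:
            out.append(b)
            return bytes(out)

def decode_varint(buf, pos):
    """Return (value, next position) for the varint at buf[pos]."""
    value, shift = 0, 0
    while True:
        b = buf[pos]
        pos += 1
        value |= (b & 0x7F) << shift
        shift += 7
        if not (b & 0x80):
            return value, pos

def encode_field(field_number, data):
    """Header (wire type 2) + varint length + payload."""
    return bytes([(field_number << 3) | 2]) + encode_varint(len(data)) + data

def parse_fields(buf):
    """Walk a buffer of length-delimited fields; returns {field_number: bytes}."""
    fields = {}
    pos = 0
    while pos < len(buf):
        header, pos = decode_varint(buf, pos)
        field_number, wire_type = header >> 3, header & 0x07
        if wire_type != 2:
            raise ValueError("this sketch only handles length-delimited fields")
        length, pos = decode_varint(buf, pos)
        fields[field_number] = buf[pos:pos + length]
        pos += length
    return fields

# Synthetic message: placeholder zeros for rand (16 bytes) and pubkey (299 bytes).
message = (
    encode_field(1, bytes(16))
    + encode_field(2, bytes(299))
    + encode_field(3, b"P-256,P-384,P-521")
    + encode_field(4, b"AES-256,AES-128,Blowfish")
    + encode_field(5, b"SHA256,SHA512")
)

fields = parse_fields(message)
print(len(message))                   # 380
print(message[:2].hex())              # 0a10
print(fields[3].decode().split(","))  # ['P-256', 'P-384', 'P-521']
```

A nice sanity check: with these field sizes the headers, lengths and payloads add up to exactly 380 bytes, the same number the 4-byte length prefix announced.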