We have recently made first attempts at integrating Helia with our p2p database system, OrbitDB.
In most cases Helia runs without issue, but we are experiencing connection dropouts during block syncing, and we have isolated the problem to js-ipfs-bitswap. In particular, it appears that LibP2P closes the connection between two browser peers, causing our database syncing protocol to break. The issue is not reproducible when syncing between Node.js peers.
To replicate the problem:
Clone OrbitDB, check out the helia branch and install dependencies:
git clone https://github.com/orbitdb/orbitdb.git
cd ./orbitdb
git checkout helia
npm i
Launch the relay:
npm run webrtc
Run the web browser tests:
npm run test:browser
The test should run successfully.
Next, open the file ./test/orbitdb-replication.test.js and change line 35 to read:
const amount = 85 + 1
Save the changes.
Run the browser test again:
npm run test:browser
It will time out. 86 records seems to be the magic number at which the sync no longer completes successfully. If it still runs successfully, increase the value to const amount = 128 + 1. At some point, the number of records to be synced will cause the sync to hang and time out.
We have traced the problem to Bitswap.
When 85 + 1 or greater is specified, the sync seems to hang and Promise.race eventually times out. In particular, I think loadOrFetchFromNetwork is not resolving; I’m guessing the first promise will not resolve while the block is not stored locally. Notably, onBlock does not seem to get fired.
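To illustrate the failure mode described above, here is a minimal sketch of a promise raced against a timer. It is not the Bitswap implementation; raceWithTimeout and neverResolves are hypothetical names used only for illustration. The point is that if the fetch promise never settles (for example because the block never arrives), the timeout is the only branch that can win the race.

```javascript
// Hypothetical helper, not taken from js-ipfs-bitswap:
// races a promise against a timer that rejects after `ms` milliseconds.
function raceWithTimeout (promise, ms) {
  let timer
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms)
  })
  // Clear the timer once either branch settles so it does not linger.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}

// Stand-in for a block fetch whose completion callback never fires.
const neverResolves = new Promise(() => {})

raceWithTimeout(neverResolves, 50)
  .catch((err) => console.log(err.message)) // logs "timed out after 50ms"
```

If the first promise resolves before the timer fires, the race returns its value instead, which matches what we see with fewer records.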
When logging is enabled, we are seeing the following errors in msg_queue when sendMessage gets called:
send error CodeError: stream reset
at MplexStream.reset (/home/haydenyoung/Development/orbitdb/orbit-db/test/browser/webpack:/@orbitdb/core/node_modules/@libp2p/interface-stream-muxer/dist/src/stream.js:144:1)
at MplexStreamMuxer._handleIncoming (/home/haydenyoung/Development/orbitdb/orbit-db/test/browser/webpack:/@orbitdb/core/node_modules/@libp2p/mplex/dist/src/mplex.js:260:1)
at MplexStreamMuxer.sink (/home/haydenyoung/Development/orbitdb/orbit-db/test/browser/webpack:/@orbitdb/core/node_modules/@libp2p/mplex/dist/src/mplex.js:160:1)
at async Promise.all (index 0)
send error Error: Muxer already closed
at MplexStreamMuxer.newStream (/home/haydenyoung/Development/orbitdb/orbit-db/test/browser/webpack:/@orbitdb/core/node_modules/@libp2p/mplex/dist/src/mplex.js:93:1)
at ConnectionImpl.newStream [as _newStream] (/home/haydenyoung/Development/orbitdb/orbit-db/test/browser/webpack:/@orbitdb/core/node_modules/libp2p/dist/src/upgrader.js:312:1)
at ConnectionImpl.newStream (/home/haydenyoung/Development/orbitdb/orbit-db/test/browser/webpack:/@orbitdb/core/node_modules/libp2p/dist/src/connection/index.js:85:1)
at Libp2pNode.dialProtocol (/home/haydenyoung/Development/orbitdb/orbit-db/test/browser/webpack:/@orbitdb/core/node_modules/libp2p/dist/src/libp2p.js:230:1)
at Network._writeMessage (/home/haydenyoung/Development/orbitdb/orbit-db/test/browser/webpack:/@orbitdb/core/node_modules/ipfs-bitswap/dist/src/network.js:190:1)
at Network.sendMessage (/home/haydenyoung/Development/orbitdb/orbit-db/test/browser/webpack:/@orbitdb/core/node_modules/ipfs-bitswap/dist/src/network.js:167:1)
LibP2P is also throwing similar errors regarding mplex and gossipsub, but I think the gossipsub errors are simply a side effect of the underlying connection closing prematurely.
We’re not sure why the connection suddenly closes and we haven’t had much success debugging it. I’m tempted to assume that the underlying LibP2P is the culprit but am not sure why it would drop the connection with such consistency.
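One thing that might help narrow down when the connection drops is to log disconnect events with a timestamp. The sketch below assumes the libp2p node exposes the standard EventTarget interface and emits a peer:disconnect event; logDisconnects is a hypothetical helper, and the shape of evt.detail differs between libp2p versions, so treat the logged value as opaque.

```javascript
// Hypothetical debugging helper, not part of OrbitDB, Helia or libp2p.
// Assumes `node` is an EventTarget that emits 'peer:disconnect'.
function logDisconnects (node, log = console.log) {
  const startedAt = Date.now()
  const onDisconnect = (evt) => {
    // evt.detail varies by libp2p version; it is logged as-is.
    log(`peer:disconnect after ${Date.now() - startedAt}ms`, evt.detail)
  }
  node.addEventListener('peer:disconnect', onDisconnect)
  // Return a function that removes the listener again.
  return () => node.removeEventListener('peer:disconnect', onDisconnect)
}

// Possible usage in a test, assuming a Helia instance named `helia`:
// const stop = logDisconnects(helia.libp2p)
// ... run the sync ...
// stop()
```

Comparing the logged time against the point where the sync stalls should show whether the disconnect precedes the stream reset errors or follows them.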
I haven’t opened a GitHub issue yet as I’m first ruling out a configuration issue with our setup.