I’m trying to transfer a lot of files between two nodes. I wrote a script that uses the API to list the pins between the two nodes, connect them to each other as peers, and then pin add each file. It works well, but after a seemingly random number of pin adds, the command hangs indefinitely and persistently. Restarting the script does nothing; restarting the receiver (Node B) usually allows the process to continue.
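For reference, the script does roughly the equivalent of the following (sketched here with the CLI rather than the HTTP API calls it actually makes; the pins.txt file name is just for illustration, and the address is Node A's from below):

# On Node A (source): list the recursive pins and save just the hashes.
ipfs pin ls --type=recursive | awk '{print $1}' > pins.txt

# On Node B (destination): connect to A, then pin each hash in turn.
ipfs swarm connect /ip4/1.2.3.4/tcp/4001/ipfs/QmSscuC3tyLSCjWDviuff422gb1mmsp1yo1sbt6KrSwFdH
while read -r hash; do
  ipfs pin add "$hash"
done < pins.txt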
They’re both running go-ipfs 0.4.13 using the IPFS Docker image, on Ubuntu Linux 16.04 AWS EC2 instances.
Here is some brief diagnostic output from after the error occurs:
Node A (source):
/ # ipfs id
{
[...]
"Addresses": [
[...]
"/ip4/1.2.3.4/tcp/4001/ipfs/QmSscuC3tyLSCjWDviuff422gb1mmsp1yo1sbt6KrSwFdH"
],
"AgentVersion": "go-ipfs/0.4.13/3b16b74",
"ProtocolVersion": "ipfs/0.1.0"
}
/ # ipfs version --all
go-ipfs version: 0.4.13-3b16b74
Repo version: 6
System version: amd64/linux
Golang version: go1.9.2
Node B (destination):
/ # ipfs id
{
[...]
"Addresses": [
[...]
"/ip4/4.3.2.1/tcp/4001/ipfs/QmQENeAYCR6AL8sQUx2XodXMam44BuyAhLcJeXsdQMwToU"
],
"AgentVersion": "go-ipfs/0.4.13/3b16b74",
"ProtocolVersion": "ipfs/0.1.0"
}
/ # ipfs version --all
go-ipfs version: 0.4.13-3b16b74
Repo version: 6
System version: amd64/linux
Golang version: go1.9.2
B was then connected to A:
/ # ipfs swarm connect /ip4/1.2.3.4/tcp/4001/ipfs/QmSscuC3tyLSCjWDviuff422gb1mmsp1yo1sbt6KrSwFdH
connect QmSscuC3tyLSCjWDviuff422gb1mmsp1yo1sbt6KrSwFdH success
Confirming on B that it is in fact connected to A:
/ # ipfs swarm peers | grep 1.2.3.4
/ip4/1.2.3.4/tcp/4001/ipfs/QmSscuC3tyLSCjWDviuff422gb1mmsp1yo1sbt6KrSwFdH
Port appears open from B to A:
/ # nc 1.2.3.4 4001
/multistream/1.0.0
And from A to B:
/ # nc 4.3.2.1 4001
/multistream/1.0.0
A has the pinned file:
/ # ipfs pin ls | grep QmNLerXsk1kp98tDbnArAchg4jwwvw2cLoHMwZteV8NXc5
QmNLerXsk1kp98tDbnArAchg4jwwvw2cLoHMwZteV8NXc5 recursive
When trying to fetch the file on Node B, the command hangs indefinitely:
/ # ipfs pin add QmNLerXsk1kp98tDbnArAchg4jwwvw2cLoHMwZteV8NXc5
These are the only relevant log messages, but they don’t seem to necessarily correlate with the start of the hangs.
00:04:24.601 ERROR bitswap: couldnt open sender again after SendMsg(<peer.ID SscuC3>) failed: session shutdown wantmanager.go:237
00:23:07.123 ERROR bitswap: couldnt open sender again after SendMsg(<peer.ID SscuC3>) failed: session shutdown wantmanager.go:237
00:30:29.622 ERROR bitswap: couldnt open sender again after SendMsg(<peer.ID SscuC3>) failed: session shutdown wantmanager.go:237
01:18:30.184 ERROR bitswap: couldnt open sender again after SendMsg(<peer.ID SscuC3>) failed: session shutdown wantmanager.go:237
I’m not really sure how to troubleshoot this further. Is there a way to get more debug output? Would a GitHub issue be more appropriate? Any suggestions?