Pubsub nodes not connecting (js-libp2p-webrtc-star)

:wave:

What could be the issue if a node doesn’t receive a pubsub message?

Notes:

  • Confirmed that both nodes are subscribed to the same topic, and connected to the same webrtc-star signaling server.
  • The nodes do appear in their respective swarm.addrs lists, but they aren’t in their swarm.peers lists (on both sides).
  • Does work if both nodes are in the same browser.
  • Does NOT work if done cross browser (same device) or cross device.
  • Signaling server is hosted on Heroku and uses HTTPS
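One way to narrow this down is to list the peers that are known but not connected. The sketch below assumes an `ipfs` instance exposing `swarm.addrs()` and `swarm.peers()` as in js-ipfs. The helper name `findUnconnectedPeers` is hypothetical, and the entry shapes (`id` on address-book entries, `peer` on connection entries) are assumptions that may differ between versions, so both are stringified before comparing.

```javascript
// Hypothetical helper: returns the IDs of peers that appear in the address
// book (swarm.addrs) but have no open connection (swarm.peers).
// Assumption: entries expose `id` and `peer` respectively.
async function findUnconnectedPeers (ipfs) {
  const known = await ipfs.swarm.addrs()
  const connected = await ipfs.swarm.peers()
  const connectedIds = new Set(connected.map((c) => String(c.peer)))
  return known
    .map((k) => String(k.id))
    .filter((id) => !connectedIds.has(id))
}
```

Calling `await findUnconnectedPeers(ipfs)` on both nodes should show whether discovery through the signaling server succeeded while the dial itself failed.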

Any ideas?
Thanks.

I tried to do a circuit relay, and that doesn’t work 95% of the time either.
Connecting to the relay works, but dialing the two browser nodes does not.
I’m following the https://github.com/ipfs/js-ipfs/tree/master/examples/circuit-relaying example.

Hey folks,
Can you let me know the versions of js-ipfs / js-libp2p you are running?
Also, we released some pubsub fixes a few hours ago, so this is probably fixed. If you are using the latest versions, reinstalling everything should pick them up: https://github.com/libp2p/js-libp2p-pubsub/pull/57

Hey, sure thing! I’m using js-ipfs v0.47.1-rc.7 (I had the issues with 0.47 as well)
I’m now using js-libp2p from master with your changes.

Config looks like this:

ipfs = await Ipfs.create({
  config: {
    Bootstrap: [
      RELAY
    ],
    Addresses: {
      Swarm: [
        SIGNALING_ADDR
      ]
    }
  },
  relay: {
    enabled: true,
    hop: {
      enabled: true
    }
  },
  init: {
    repo: "ipfs-" + Date.now(),
    repoAutoMigrate: true
  },
  libp2p: (opts) => {
    return new Libp2p(opts.libp2pOptions)
  }
})

Does that look ok?


That said, sadly I’m still having these issues.

Hey!
I don’t recommend using an rc when there is a final release.
Did you try updating to ipfs@0.48? It has the latest libp2p with some fixes for pubsub that we did. We have tested https://github.com/libp2p/js-libp2p-examples/tree/master/chat and everything seems to be working. Maybe you can use this example as a starting point?

Meanwhile, I backported these fixes to the older libp2p-pubsub to try to help with this: https://github.com/libp2p/js-libp2p-pubsub/releases/tag/v0.4.7
You can reinstall and run npm ls libp2p-pubsub to check whether you are running it. In any case, I highly recommend moving to ipfs@0.48, as it will be running all the latest libp2p releases.

If this does not help, it would be useful if you could give me an easy way to debug this: an online editor with instructions, or a repo.

Thanks for the fast response!

  1. Good to know about the rc, I only switched to it because the stable version wasn’t working for me.
  2. Is ipfs@0.48 released yet? I don’t see it on GitHub or npm.
  3. I hadn’t seen that specific example yet, thanks! I’ll check that out.

I’ll do some more debugging/testing and I’ll get back to you.

Hey @icidasset

Sorry for the confusion, it was still quite early when I wrote that. I meant the final 0.47 release. Final releases are an iteration on the rc, with tests and integrations in other projects, so I was referring to the 0.47 that you tried before. I would make sure that you are running the latest libp2p-pubsub (0.5.4). Pubsub should be working as expected, but I suggest following the example I linked to fully understand the flows behind the scenes.

@vasco-santos Oh ok, gotcha :smile:

I got it working for the most part now, except that it doesn’t work with Safari.
Made a quick demo page at https://petite-scarlet-plastic-bear.fission.app/
See page source for setup.

If I open that in Firefox and Chrome, the nodes connect just fine.
Now for some reason it doesn’t work when I try to use Safari :man_shrugging:
That is, if I have one node in Firefox and the other in Safari, the nodes don’t connect.
I’m using Safari v13.1.1 (latest I think) on Mac OS 10.14.6 (not Catalina)

Additional notes:

  • I realise the demo page uses a precompiled 0.47 version of js-ipfs, most likely not including your libp2p-pubsub changes. THAT SAID, I have made a local build with your changes and it’s the same thing.

  • If I try to connect manually, in the dev console, to ${RELAY_ADDRESS}/p2p-circuit/p2p/${PEER_ID_SAFARI} it does sometimes connect, and is then able to deliver the pubsub message.

  • Other times with that manual connect it fails with various errors. Which include:

    • The error posted above (hop request failed with code 260)
    • Aggregate error stream ended before 1 bytes became available
    • Aggregate error The operation was aborted
  • On the Safari node’s end I sometimes get the following error:

    WebSocket network error: The operation couldn’t be completed. Broken pipe

    And also:

    WebSocket connection to ‘wss://RELAY_ADDRESS_OMITTED’ failed: WebSocket is closed before the connection is established.

Any ideas?
Thanks for your hard work on all of this :v:

I tried your deployed example with 2 Safari windows + 1 Chrome window and they all connected to each other (Safari Version 13.0.5 (15608.5.11)). Did you change anything in the meantime?

I realise the demo page uses a precompiled 0.47 version of js-ipfs, most likely not including your libp2p-pubsub changes. THAT SAID, I have made a local build with your changes and it’s the same thing.

You mean the example?

If I try to connect manually, in the dev console, to ${RELAY_ADDRESS}/p2p-circuit/p2p/${PEER_ID_SAFARI} it does sometimes connect, and is then able to deliver the pubsub message.

Can you replicate it via code? A code sample in jsfiddle or similar would be helpful.

Uh, what the … it’s working for me now as well, and I didn’t change anything :man_shrugging:
Maybe something to do with the signaling server?
To be really specific, it didn’t work the first time with Safari + Chrome, but since I opened 3 different browsers it’s been working fine.

You mean the example?

Yeah sorry, meant the example.

Can you replicate it via code?

Weirdly enough, I couldn’t connect with swarm.connect in my code (I wrote something with setInterval and try/catch), but it did work sometimes in the dev console.
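For reference, a retry loop of this kind can be sketched with a plain `for` loop instead of setInterval, which avoids overlapping attempts. This is a minimal sketch, not the code used on the demo page: `connectWithRetry`, `attempts`, and `delayMs` are hypothetical names, and the only js-ipfs call assumed is `ipfs.swarm.connect(addr)`.

```javascript
// Hypothetical helper: tries ipfs.swarm.connect up to `attempts` times
// (attempts >= 1), waiting `delayMs` between failures.
// Resolves with the number of the attempt that succeeded,
// or rejects with the last error seen.
async function connectWithRetry (ipfs, addr, { attempts = 5, delayMs = 2000 } = {}) {
  let lastError
  for (let i = 1; i <= attempts; i++) {
    try {
      await ipfs.swarm.connect(addr)
      return i
    } catch (err) {
      lastError = err
      if (i < attempts) {
        // Wait before the next attempt; no delay after the final failure.
        await new Promise((resolve) => setTimeout(resolve, delayMs))
      }
    }
  }
  throw lastError
}
```

Logging `lastError` for each failed attempt would show whether the errors match the ones seen in the dev console.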


So confused right now :sweat_smile:
It seems it’s working better today for some reason.
I’ll get back to you in case it goes bananas again.

Thanks for the help! :pray:

That is strange indeed. Let me know if you are able to replicate the issue again.

If you were using the “preload” nodes, you may have experienced issues while we had an incorrect config change deployed; see the thread “JS-IPFS website not working”.

Good to know thanks :+1:
Can I do the following to avoid that?

Ipfs.create({ preload: { enabled: false } })

Or does that no longer work?

Experiencing another issue right now :sweat_smile: Only happens sometimes though. What if the nodes are connected (ie. they appear in their respective swarm.peers lists), but they don’t appear in the pubsub.peers lists (ie. they don’t receive messages)?
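To check for this state from code, the two lists can be compared directly. The sketch below assumes `ipfs.swarm.peers()` and `ipfs.pubsub.peers(topic)` as exposed by js-ipfs; the helper name `findPeersMissingFromPubsub` is hypothetical, and the `peer` field on connection entries is an assumption that may vary by version.

```javascript
// Hypothetical helper: returns the IDs of peers that have an open connection
// but are not listed as pubsub peers for `topic`.
// Assumption: swarm.peers() entries expose `peer`.
async function findPeersMissingFromPubsub (ipfs, topic) {
  const connected = await ipfs.swarm.peers()
  const pubsubPeers = await ipfs.pubsub.peers(topic)
  const pubsubIds = new Set(pubsubPeers.map(String))
  return connected
    .map((c) => String(c.peer))
    .filter((id) => !pubsubIds.has(id))
}
```

Note that peers which never subscribed to the topic, such as a relay, would also show up in the result, so only the ID of the other browser node is interesting here.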

The libp2p flow for establishing a connection and opening a stream is as follows:

  • peerA discovers peerB (will dial it, if the autoDial option is enabled, which is the default!)
  • peerA dials peerB and they negotiate their security and multiplex protocols
  • once the connection is established, the identify protocol will kick in. The identify protocol will make peers exchange their listen multiaddrs and running protocols.
  • if peers are running the pubsub protocol, the onConnect handler for the pubsub protocol is called. It will open a pubsub stream to receive messages: https://github.com/libp2p/js-libp2p-pubsub/blob/master/src/index.js#L186-L212 This handler will add the peer data (with the stream), to the pubsub.peers
  • When a peer gets disconnected, the pubsub onDisconnect handler is called. Its stream is closed and it is removed from the pubsub.peers
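The bookkeeping in the last two steps can be pictured with a toy model. This is only an illustrative sketch and not the js-libp2p-pubsub implementation: the class `PubsubPeerRegistry`, its method signatures, and the `openStream` callback are all hypothetical.

```javascript
// Toy model of the peer bookkeeping described above.
// This is NOT the js-libp2p-pubsub code; all names are hypothetical.
class PubsubPeerRegistry {
  constructor (protocol) {
    this.protocol = protocol // protocol ID this registry cares about
    this.peers = new Map() // peerId -> { stream }
  }

  // Called once a connection is established and identify has run.
  // `protocols` is the list announced by the remote peer; `openStream` is a
  // caller-supplied function that opens a stream for the given protocol.
  onConnect (peerId, protocols, openStream) {
    if (!protocols.includes(this.protocol)) return false
    this.peers.set(peerId, { stream: openStream(this.protocol) })
    return true
  }

  // Called when the peer disconnects: close its stream and forget the peer.
  onDisconnect (peerId) {
    const entry = this.peers.get(peerId)
    if (!entry) return false
    entry.stream.close()
    this.peers.delete(peerId)
    return true
  }
}
```

In this model, a peer that is still connected but has had `onDisconnect` called for it no longer appears in `peers`, which corresponds to being present in swarm.peers while missing from pubsub.peers.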

Now that the flow is clear, it seems that onDisconnect was called without the connection being closed. Did you refresh the web page? If not, I would need to see the code to understand what the issue might be, or you would need to add logging to the onDisconnect handler to figure out why it is being called.

Thanks for explaining that flow!

I was finally able to replicate the issue consistently.

  1. If I close my browser windows and open the pages, it works as expected.
  2. If I then refresh either of those pages, or both, without closing them, it no longer works.

See video :point_down:

Website from the video:
https://auth.fission.codes

I can’t send you the code for this, yet, but I’ll DM you on the Fission Discord when I can open-source this. I’ll try to debug a bit more based on that onConnect and onDisconnect code you’ve explained.

Can you please run npm ls libp2p-pubsub in the project root and confirm the js-ipfs release, so that I can try to replicate it myself?
I ask because we fixed the refresh bug a couple of days ago.

Yup, that fixed it! :tada:
I guess it didn’t come through last time because I didn’t close the windows first…

Thanks so much for all the help :pray:
