ParseSwift SDK: Observe LiveQuery WebSocket status

I’ve looked further into ParseLiveQuery and fixed a bug where a web socket task was being reused after it was closed. I also addressed some of the error handling you mentioned. The fixes are in the PR I mentioned earlier.

You can try out the PR and test out the updated playgrounds:

I can confirm that with the fix I am able to reset LiveQuery successfully, and the messages now appear immediately in the cloud code info log (not sure if Back4App has done anything on their side, as I have had no feedback yet).

First I tried only calling .unsubscribe() and .subscribe() after a broken connection (cloud code upload, server reset, …), but that did not work; both functions give this error:

Error Domain=NSPOSIXErrorDomain Code=57 "Socket is not connected" UserInfo={NSErrorFailingURLStringKey=https://felse.b4a.io/, NSErrorFailingURLKey=https://felse.b4a.io/}

Just using .unsubscribe() and .subscribe() works well before the connection is broken. So I adopted the .closeAll() function:

ParseLiveQuery.getDefault().closeAll()

Here I noticed there are 3 ways to do it:

#1

  1. .unsubscribe()
  2. .closeAll()
  3. var subscription = query.subscribeCallback!, setting the completion handlers

#2

  1. .closeAll()
  2. var subscription = query.subscribeCallback! without setting the completion handlers; otherwise the event handlers are doubled

#3 ← I ended up using this and it works perfectly

  1. ParseLiveQuery.getDefault().closeAll()
  2. ParseLiveQuery.getDefault().open(completion: { error in … })
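
Option #3 can be wrapped in a small helper. The following is only a sketch under assumptions: LiveQueryResetting and FakeClient are hypothetical stand-ins so that the snippet runs on its own. They mirror the closeAll() and open(completion:) calls listed above but are not ParseSwift types.

```swift
import Foundation

// Hypothetical protocol mirroring the two calls used in option #3.
// It is NOT part of ParseSwift; it only lets the sketch run standalone.
protocol LiveQueryResetting {
    func closeAll()
    func open(completion: @escaping (Error?) -> Void)
}

// Close every socket first, then reopen and report the outcome.
func resetLiveQuery(_ client: LiveQueryResetting,
                    completion: @escaping (Error?) -> Void) {
    client.closeAll()
    client.open(completion: completion)
}

// Hypothetical fake client that records the order of the calls.
final class FakeClient: LiveQueryResetting {
    private(set) var calls: [String] = []

    func closeAll() {
        calls.append("closeAll")
    }

    func open(completion: @escaping (Error?) -> Void) {
        calls.append("open")
        completion(nil) // pretend the socket opened without an error
    }
}

let fake = FakeClient()
var resetError: Error? = NSError(domain: "placeholder", code: 0)
resetLiveQuery(fake) { error in
    resetError = error
}
print(fake.calls)
```

In an app, the real client would take the place of FakeClient, and the completion handler is where a failed reopen could be logged or retried.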

What would be the correct way from the server side? In the cloud code info log I see that it disconnects even using the second way (not calling .unsubscribe()):

2021-06-26T11:28:31.866Z - Client disconnect: ac12dfda-d1ad-4999-a08d-6effaf448a05

As the connection is already broken at that point, I believe there is no advantage in calling .unsubscribe() for the server side, right? But can any zombie subscriptions be left hanging in the LiveQuery server?

Thank you for solving this issue! I believe there is now a robust way to reset the LQ connection without restarting the client app!

I don’t know enough about how live query works on the server side to say what you should do there. My guess is that once a connection is closed the server will discard the subscriptions, but @davimacedo might have more info here.

I’ll point out that now on the client side, after you unsubscribe from all of your subscriptions, it will automatically close the connection.

The second and third ways you mentioned seem reasonable to me. It just depends on your scenario. If you are subscribing to a new query in scenario 2, it should reconnect and also resubscribe to any previous queries. Scenario 3 should reconnect and resubscribe to all previous queries.

Note that you can also use: ParseLiveQuery.client?.open, ParseLiveQuery.client?.openPublisher, ParseLiveQuery.client?.close(), ParseLiveQuery.client?.closeAll()


Thank you for the clarification. In my case #3 is the most elegant, and it works after a broken connection (and also with a live connection).

As you mentioned, I noticed that .unsubscribe() closes the connection (if it was not broken before), which I can confirm in Xcode and via the cloud code info log. Just out of curiosity, shouldn’t the URLSessionWebSocketDelegate call didCloseWith at that moment? I see this line in the debug console:

2021-06-26 19:20:25.070351+0200 Felse[6674:997878] [websocket] Read completed with an error Operation canceled

But when I put a breakpoint in the following lines, the function does not get called, although I would expect it to be (with my limited knowledge)…

func urlSession(_ session: URLSession,
                webSocketTask: URLSessionWebSocketTask,
                didCloseWith closeCode: URLSessionWebSocketTask.CloseCode,
                reason: Data?) {
    self.delegates.forEach { (_, value) -> Void in
        value.status(.closed)
    }
}

When you unsubscribe from all subscriptions or call close or closeAll, the socket is closed from the client side (see the Apple documentation for cancel()), not the server side. I’ll defer to the Apple documentation for the delegate, but my interpretation is that the delegate method gets called when the server requests to close the connection, in which case the server sends a close frame telling the client to close gracefully:


I see, thank you for the patience and great clarification!


In addition you can also receive connection metrics on the client side and make decisions from there. You can do that by becoming a receiveDelegate:

http://parseplatform.org/Parse-Swift/api/Classes/ParseLiveQuery.html#/s:10ParseSwift0A9LiveQueryC15receiveDelegateAA0acdF0_pSgvp

And then using received(_ metrics: URLSessionTaskTransactionMetrics)


with this snippet:

let client = ParseLiveQuery.getDefault()
client?.receiveDelegate = self

I tried the ParseLiveQueryDelegate as you proposed, and it does indeed notify me that there was a disconnection:

func received(_ error: ParseError) {
    print("received: \(error)")
}

This is received twice, no matter whether I subscribe to 1 or 2 queries:

received: ParseError code=-1 error=The operation couldn’t be completed. Socket is not connected

Digging a bit deeper in the LiveQuerySocket function…

…I see that line 97 calls receive(task) again, which is a bit confusing to me. Perhaps you could shed some light on that. But as it is not related to the disconnection, I went on to lines 103 and 104. I printed the error before it gets translated to ParseError (line 103):

Error Domain=NSPOSIXErrorDomain Code=57 "Socket is not connected" UserInfo={NSErrorFailingURLStringKey=https://felse.b4a.io/, NSErrorFailingURLKey=https://felse.b4a.io/}

So the error code that should trigger a query reset is 57 "Socket is not connected", and at that point ParseLiveQuery should close itself (which it currently doesn’t). We can see that line 104 passes the error to ParseLiveQuery line 473 and then to the receiveDelegate.
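
A check for this condition on the raw error, before it is wrapped, could look roughly like the sketch below. It assumes the error arrives as an NSError in NSPOSIXErrorDomain, as in the log above. isSocketNotConnected is a hypothetical helper for illustration, not an SDK function.

```swift
import Foundation

// Hypothetical helper: true when the error is POSIX ENOTCONN
// ("Socket is not connected", code 57 on Apple platforms).
func isSocketNotConnected(_ error: Error) -> Bool {
    let nsError = error as NSError
    return nsError.domain == NSPOSIXErrorDomain
        && nsError.code == Int(POSIXErrorCode.ENOTCONN.rawValue)
}

// An error shaped like the one in the log above.
let notConnected = NSError(domain: NSPOSIXErrorDomain,
                           code: Int(POSIXErrorCode.ENOTCONN.rawValue))
// An unrelated error that should not trigger a reset.
let badResponse = NSError(domain: "NSURLErrorDomain", code: -1011)

print(isSocketNotConnected(notConnected))
print(isSocketNotConnected(badResponse))
```

Comparing against POSIXErrorCode.ENOTCONN rather than the literal 57 keeps the check independent of the platform's numeric value.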

Why it gets called twice I could not understand, even after setting a lot of breakpoints. Assuming I ignore the second call with some Bool flag, I would like to implement a reconnecting feature in the receive delegate. Here I noticed:

  1. that all functions are mandatory, which makes the delegate look like this:
extension ParseService: ParseLiveQueryDelegate {
    
    func received(_ challenge: URLAuthenticationChallenge, completionHandler: @escaping (URLSession.AuthChallengeDisposition, URLCredential?) -> Void) {
        
    }
    
    func received(_ error: ParseError) {
        print("received: \(error)")
    }
    
    func receivedUnsupported(_ data: Data?, socketMessage: URLSessionWebSocketTask.Message?) {
        
    }

    func received(_ metrics: URLSessionTaskTransactionMetrics) {
        
    }

    func closedSocket(_ code: URLSessionWebSocketTask.CloseCode?, reason: Data?) {
        
    }
        
}

Do you think the functions could be made optional, or would that break some logic?

  2. the func received(_ error: ParseError) already translates the error to ParseError and therefore hides the webSocket error code 57. As ParseError does not yet have code 57, I think it could be added and then passed through this function. Alternatively, the function could pass the original Error instead of ParseError.

  3. Or maybe the SDK itself could react to code 57 and try to reconnect, so that the client would not need to implement receiveDelegate to handle this?

What do you think? Thank you!

This is a requirement of URLSessionWebSocketTask. Apple has a video describing how URLSessionWebSocketTask works and there’s a blog that discusses:

This was already implemented, the extension just wasn’t public. The PR below makes it public.

You can test out the branch to see if it works: LiveQuery socket should always continue receiving by cbaker6 · Pull Request #204 · parse-community/Parse-Swift · GitHub

Great! Thank you for the clarification!

As I am testing it out, I see it is calling self.open(isUserWantsToConnect: false) { _ in } on line 489

But I believe that passing parameter isUserWantsToConnect: false does not set isConnected = false and therefore it stays true (as my debug prints show).

Stepping through with breakpoints revealed that open(isUserWantsToConnect:) returns on line 540

Calling self.open(isUserWantsToConnect: true) { _ in } feels incorrect, so what if isConnected were set to false right before calling self.open(isUserWantsToConnect: false) { _ in }?

func receivedError(_ error: Error) {
    guard let posixError = error as? POSIXError else {
        notificationQueue.async {
            self.receiveDelegate?.received(error)
        }
        return
    }
    if posixError.code == .ENOTCONN {
        if attempts + 1 >= ParseLiveQueryConstants.maxConnectionAttempts + 1 {
            let parseError = ParseError(code: .unknownError,
                                        message: """
Max attempts (\(ParseLiveQueryConstants.maxConnectionAttempts) reached.
Not attempting to connect to LiveQuery server anymore.
""")
            self.receiveDelegate?.received(parseError)
        }
        self.isConnected = false   // <---- setting isConnected to false here
        self.open(isUserWantsToConnect: false) { _ in }
    } else {
        notificationQueue.async {
            self.receiveDelegate?.received(error)
        }
    }
}

Another state that could be covered is a failed self.open(isUserWantsToConnect: false) { _ in } attempt. Since it has an empty completion block and open(isUserWantsToConnect:) can fail with an error, returning that error to the empty completion block would not inform the receiveDelegate that the reconnection failed. But perhaps this is covered by the status(_ status: LiveQuerySocket.Status, closeCode: URLSessionWebSocketTask.CloseCode?, reason: Data?) protocol function; I will have a look at how that behaves…

Let me know how the PR below works:

This recovers the connections successfully. It just doesn’t work when the server has not booted up yet. In my example, when I upload new cloud code the server hard-disconnects and resumeTask() tries to open the connection again, but I guess the server (or the client?) is not ready yet. So in this example it works only if I hold the execution for a few seconds with a breakpoint, giving the server some time.

func resumeTask() {
    synchronizationQueue.sync {
        switch self.task.state {
        case .suspended:
            isSocketEstablished = false
            task.resume()
            URLSession.liveQuery.receive(self.task)
            URLSession.liveQuery.delegates[self.task] = self
        case .completed, .canceling:
            URLSession.liveQuery.delegates.removeValue(forKey: self.task)
            isSocketEstablished = false
            task = URLSession.liveQuery.createTask(self.url) // <---- Breakpoint 5-10 seconds
            task.resume()
            URLSession.liveQuery.receive(self.task)
            URLSession.liveQuery.delegates[self.task] = self
        case .running:
            isConnected = false
            isSocketEstablished = true
            open(isUserWantsToConnect: false) { _ in }
        @unknown default:
            break
        }
    }
}

With the help of that breakpoint I see this in the debug console:

Successfully subscribed to new query Inbox ({"limit":100,"skip":0,"_method":"GET","where":{"rid":"3xwiNx3zsU"}})
Successfully subscribed to new query Group ({"limit":100,"skip":0,"_method":"GET","where":{"objectId":{"$in":["08BWZVHzES"]}}})
2021-08-01 18:24:52.717398+0200 Felse[20203:1835312] Connection 3: missing error, so heuristics synthesized error(1:53)
2021-08-01 18:24:52.717694+0200 Felse[20203:1835312] Connection 3: encountered error(1:53)
Successfully subscribed to new query Group ({"limit":100,"skip":0,"_method":"GET","where":{"objectId":{"$in":["08BWZVHzES"]}}})
Successfully subscribed to new query Inbox ({"limit":100,"skip":0,"_method":"GET","where":{"rid":"3xwiNx3zsU"}})

But without the breakpoint it does not recover.

The latest commit may help as it should add some delay before attempting to reconnect.

If the problem is still there, I recommend using the delegates to handle your custom situations. If you see a place in the SDK to improve, feel free to submit a PR.

Ah, great, thank you for implementing the delay there. Unfortunately the reconnection interval is too short and it does not help in this case.

I had a look, and already the first breakpoint showed that the number of attempts was 4, so I put the reconnection interval calculation into a playground and found that it mostly generates 0 seconds:

import Foundation

for _ in 1...5 {
    // Run 5 trials, each covering attempt counts 1 through 4.
    var intervals: [Int] = []
    for i in 1..<5 {
        // Upper bound for this attempt: min(30, 2^i - 1) seconds.
        let min = NSDecimalNumber(decimal: Swift.min(30, pow(2, i) - 1))
        intervals.append(Int.random(in: 0 ..< Int(truncating: min)))
    }
    print(intervals)
}

It seems that this random Int causes the reconnectionInterval to often not wait at all, as it gives back 0 seconds:

[0, 1, 2, 14]
[0, 0, 4, 3]
[0, 1, 1, 4]
[0, 0, 0, 13]
[0, 2, 6, 14]

Without Int.random(in: 0 ..< Int(truncating: min)), using only Int(truncating: min), it gives back seemingly more reasonable intervals:

[1, 3, 7, 15]
[1, 3, 7, 15]
[1, 3, 7, 15]
[1, 3, 7, 15]
[1, 3, 7, 15]

What is the idea behind the random integer there? While trying to understand the behaviour, I noticed that even during app launch resumeTask() gets triggered many times, mostly with a random reconnectionInterval. So I made a fork and tested it with the second variant, and I can confirm that it reconnects successfully when I upload new cloud code. I submitted my first PR ever, so let me know if I should adjust anything.

For reference for when others see this, linking to your last comment on Github where you mention this is solved: Removing random Int in the reconnection interval of ParseLiveQuery and added warning in playgrounds by lsmilek1 · Pull Request #208 · parse-community/Parse-Swift · GitHub

I am afraid I have to come back to this never-ending topic. After working further with live query and cloud code, I noticed that the LQ reconnects:

  • always when back4app container goes to sleep. It wakes it up again and reconnects
  • only sometimes (actually almost never) when I upload cloud code and the container restarts.

With receiveDelegate implemented, I can see that the delegate receives only one error, which consistently reports a count of 4 time(s):

ParseError code=-1 error=ParseLiveQuery Error: attempted to open socket 4 time(s)

And as previously mentioned, if I use a breakpoint and hold the execution a bit, it always reconnects successfully. So I still suspect that there is a race condition that removes the task delegate and quits the reconnection loop. Shouldn’t the completion on line 587 be called inside the completion block of self.resumeTask { _ in } on line 583?

so that the loop waits on potential task creation (236-241)? I’m not sure about the default break though.

Another improvement could be to not set the reconnection interval to 0, but that is a bit hacky, as we discussed.
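
One way to keep some randomness while never waiting 0 seconds would be to add a lower bound. This is a sketch only, not the SDK's implementation: reconnectInterval(attempt:minimum:) and the 1-second floor are assumptions for illustration, while the cap of 30 and the 2^attempt - 1 upper bound come from the playground snippet earlier in this thread.

```swift
import Foundation

// Upper bound from the playground snippet: min(30, 2^attempt - 1) seconds.
func maxInterval(attempt: Int) -> Int {
    // 2^5 - 1 = 31 already exceeds the cap, so larger exponents are not needed.
    let exponent = min(max(attempt, 0), 5)
    return min(30, (1 << exponent) - 1)
}

// Hypothetical variant: a random value in [minimum, upper bound], so never 0.
func reconnectInterval(attempt: Int, minimum: Int = 1) -> Int {
    let upper = max(minimum, maxInterval(attempt: attempt))
    return Int.random(in: minimum...upper)
}

// Print attempt number, upper bound, and one random draw for attempts 1-4.
for attempt in 1...4 {
    print(attempt, maxInterval(attempt: attempt), reconnectInterval(attempt: attempt))
}
```

With a floor of 1 the first attempt always waits one second, and later attempts still spread out randomly up to the same upper bounds as before.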

Can you try out:

Update: SDK version 1.9.4 should address the reconnection issues:

Great, the receiveDelegate now shows multiple messages and the reconnection is successful!

2021-08-06 08:59:35.369261+0200 Felse[39963:3872571] Connection 4: missing error, so heuristics synthesized error(1:53)

2021-08-06 08:59:35.369578+0200 Felse[39963:3872571] Connection 4: encountered error(1:53)

received: ParseError code=-1 error=ParseLiveQuery Error: attempted to open socket 3 time(s)

received: Error Domain=NSURLErrorDomain Code=-1011 "There was a bad response from the server." UserInfo={NSErrorFailingURLStringKey=https://felse.b4a.io/, NSErrorFailingURLKey=https://felse.b4a.io/, _NSURLErrorWebSocketHandshakeFailureReasonKey=140703128616960, NSLocalizedDescription=There was a bad response from the server.}

received: Error Domain=NSURLErrorDomain Code=-1011 "There was a bad response from the server." UserInfo={NSErrorFailingURLStringKey=https://felse.b4a.io/, NSErrorFailingURLKey=https://felse.b4a.io/, _NSURLErrorWebSocketHandshakeFailureReasonKey=140703128616960, NSLocalizedDescription=There was a bad response from the server.}

Successfully subscribed to new query Inbox ({"limit":100,"skip":0,"_method":"GET","where":{"rid":"yxnjadgZol"}})

Successfully subscribed to new query Group ({"limit":100,"skip":0,"_method":"GET","where":{"objectId":{"$in":["0IeDlCF11M"]}}})