Taking Parse Global

#1

We’ve built a really successful app on Parse and now need to distribute our database to keep latency within acceptable limits. Our users are mainly based in Australia and the EU, with a growing number joining from North America.

I’m sure most people are hosting their MongoDB database with a service such as Atlas, mLab (now Atlas, BTW), compose.io, etc. However, sharding to allow low-latency writes (as described here) is ridiculously expensive, starting at $2.50/hr.
This price rules out a service such as Atlas.

Other options
I believe that SashiDo are able to offer their service with MongoDB replica sets distributed around the world. The last quote I had from them came in at ~$600/mo.

One other option I’ve looked at is Azure Cosmos DB, a globally distributed database with an API for MongoDB. I am unsure whether this can be used as the core database for Parse Server or whether it relies on some kind of API layer. I note that the Bitnami Parse Server image in the Azure Marketplace uses Cosmos, which suggests that it can be used with Parse Server. There is an adapter for Parse Server on GitHub, along with some instructions; however, I have also heard of issues with it and of costs getting out of control.

Questions

  1. Does anyone have experience with using Cosmos to attain global distribution?
  2. Are there any issues with delayed writes / replica collisions?
  3. What other options are there for ‘going global’?

@flovilmart Many thanks for the community, for your hard work on the tools, and for allowing me to launch a global SaaS based on Parse.

#2

Hello,

As far as I’m aware, you can use Cosmos with Parse Server right now because it’s compatible with the MongoDB protocol. Here is an example of how to migrate from Mongo to Cosmos.

Regarding the performance of your application, consider hosting your entire Parse Server infrastructure in Azure as well.

I hope this information is helpful.

P.S.: Regarding the @sashido pricing, it’s impossible for a global MongoDB cluster across 2 or 3 continents to cost $600/mo :smile:

#3

I do not believe sharding is the solution for you.

In terms of architecture, you could very well run three applications in three clusters / data centers. Each user would be able to pick their closest data center. As you seem to be building a SaaS service, this could be reasonable.

Also consider read vs. write loads. If you are looking for fast reads across the globe, then you can leverage read replicas in strategic locations. Each parse-server deployed on the ‘edges’ could be assigned to the read-only nodes by setting the read preference.

So far, what did you envision as part of your architecture, and what are your constraints / challenges in terms of read/write performance?

#4

Hi @flovilmart,

I’m kind of pleased to hear that sharding might not be the solution; however, I fear that we’re already too far down the road to be looking at separate databases in each region so that we can offer one close to the user. We already have ~15000 users spread around the world in our single database, currently hosted in the EU.

I’m interested to hear about read replicas; indeed, I think this is actually what Atlas are offering, rather than full read/write capability. This still comes in at a ridiculously high price point that is out of our reach. In answer to your question, I expect that we are reading much more than writing, to the tune of about 4:1.

From a quick configuration check on Atlas, I note that you can add read-only replica nodes for very little increase in price. We are currently running an M10, which is $0.09/hr. Adding a read-only node in US-East costs $0.02/hr and adding one in AP-Southeast costs another $0.04/hr. So we can add 1 node in US-East and 1 node in AP-Southeast for a price increase of $0.06/hr ($0.15/hr total). Is 1 node considered enough, with failover falling back to a slower read time? This would be a great compromise if so.
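The pricing arithmetic above can be checked with a short sketch. The hourly prices are the ones quoted in this post; the 730-hour month and the function names are assumptions made for illustration, and actual Atlas pricing may differ.

```javascript
// Hourly prices in US cents, taken from the figures quoted in this post.
// Integer cents are used to avoid floating-point rounding surprises.
const BASE_M10_CENTS = 9; // M10 cluster: $0.09/hr
const READ_ONLY_NODE_CENTS = {
  "US-East": 2, // $0.02/hr
  "AP-Southeast": 4, // $0.04/hr
};
const HOURS_PER_MONTH = 730; // assumption: average month length

// Sum the base cluster price and every extra read-only node.
function hourlyCostCents(baseCents, nodeCents) {
  return Object.values(nodeCents).reduce((sum, c) => sum + c, baseCents);
}

// Render an integer number of cents as a dollar string.
function formatDollars(cents) {
  return `$${(cents / 100).toFixed(2)}`;
}

const hourly = hourlyCostCents(BASE_M10_CENTS, READ_ONLY_NODE_CENTS);
console.log(`Hourly:  ${formatDollars(hourly)}`); // $0.15
console.log(`Monthly: ${formatDollars(hourly * HOURS_PER_MONTH)}`); // $109.50
```

Under those assumptions, the two extra nodes take the cluster from about $66/mo to about $110/mo.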

Our main constraint / challenge in terms of read/write performance is that within the EU API calls complete in ~150ms, whereas this is around 500 - 650ms when calling from Australia.

I’ve tried hosting an app server (Cloud Code) close to the users so that the long-distance call is between AWS infrastructure, in the hope that this would be faster (i.e. Elastic Beanstalk server in Singapore -> AWS MongoDB server in London). There is little difference between this and the app server and MongoDB being close to each other (Elastic Beanstalk server in London -> MongoDB server in London).

I guess we could decide to have different databases in each region and migrate current users to their local server; however, this is complexity I would rather avoid.

Any ideas / suggestions are greatly appreciated.

#5

I believe it’s worth trying the read replicas for now, as the cost is contained. Writes will always be problematic with a single common database for all your services, but as you mention your workload is mostly reads, you can get away with it.

Your 150ms response times look quite high. Are all your queries optimized and properly indexed?

#6

To be honest I am getting closer to 50 - 90ms in the UK.

Time for one last noddy question: with a read replica, how quickly is it updated? So I perform a write which is written to the primary; is there then a write to the read-only replica within ms?

Queries are written per the Parse docs. Are there any docs on how to properly index? I don’t run any client-side libraries; I purely use the REST API and a rather large cloud app.

#7

With a read replica, how quickly is it updated? So I perform a write which is written to the primary; is there then a write to the read-only replica within ms?

This depends on a variety of factors, and ultimately you can control it with the read concern parameters: https://docs.mongodb.com/manual/replication/

Are there any docs on how to properly index?

Not on Parse directly, but on MongoDB with the profiler: https://docs.mongodb.com/manual/tutorial/manage-the-database-profiler/. Atlas has a Performance Advisor that can help you with the slow queries.

#8

Thanks @flovilmart.

Atlas support were helpful and provided the following info and resources if anyone else is interested:

To ensure applications can read their own writes in a distributed environment, MongoDB 3.6 introduced the concept of causal consistency. Our development team explains more about this feature in the “Causal guarantees are anything but casual” blog post. As for latencies, the network round-trip time between Ireland and Sydney is roughly 300ms, providing a lower bound for replicated writes.

Slow queries can be viewed simply by navigating to Metrics > Performance Advisor in the Atlas admin dashboard.
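As a rough illustration of what causal consistency looks like at the driver level, here is a minimal sketch. It assumes `client` is a connected `MongoClient` from the official MongoDB Node.js driver; the function name and its parameters are hypothetical. This is plain driver code rather than anything Parse Server does for you, and as far as I know Parse Server does not expose driver sessions through its own API.

```javascript
// Write a document and read it back inside one causally consistent session.
// `client` is assumed to be a connected MongoClient (official Node.js driver).
async function writeThenReadOwnWrite(client, dbName, collectionName, doc) {
  // Operations sharing this session are ordered causally by the server.
  const session = client.startSession({ causalConsistency: true });
  try {
    const collection = client.db(dbName).collection(collectionName);

    // Majority write concern is needed for the read-your-writes guarantee.
    const { insertedId } = await collection.insertOne(doc, {
      session,
      writeConcern: { w: "majority" },
    });

    // The read may be served by a secondary, but only once that secondary
    // has caught up to the write made earlier in the same session.
    return await collection.findOne(
      { _id: insertedId },
      {
        session,
        readPreference: "secondaryPreferred",
        readConcern: { level: "majority" },
      }
    );
  } finally {
    await session.endSession();
  }
}
```

The trade-off is that a read from a distant secondary may have to wait for replication, so the roughly 300ms round trip mentioned above still applies to reads that immediately follow a write.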

#9

Awesome! 300ms would be fine I believe for the replication and your use case.

Did you find any queries that required optimization?

#10

I was about to say no, then I went back to take a screenshot and there were some :roll_eyes:.

The Performance Advisor gave me the following advice:

An index for this query shape will improve the efficiency of read operations. To create an index, use db.collection.createIndex() in the mongodb shell or a similar method from your driver.

db = db.getSiblingDB("database-name")
db.getCollection("ClassName").createIndex({ "shareKey": 1 }, { background: true })

Can we safely do such things while Parse Server is running?

#11

This should be fine: indexes are built in the background, so they do not affect your ability to serve data.

#12

I implemented this index and it worked great.

For others’ info, the MongoDB Compass GUI tool allows indexes to be created in a really simple way.

#13

Hi @flovilmart,

Apologies if I’m asking this in the wrong place, but I wanted to know how we configure parse-server with multiple read replicas.

Thanks.

#14

Technically you start the server with the read replica URL instead of the full DB URL.
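One way to read this, sketched under assumptions: keep the replica set connection string and add the standard MongoDB `readPreference` option, so that each regional parse-server prefers a nearby or secondary node. The helper name and the hostnames below are hypothetical; `databaseURI` is the Parse Server option that takes the connection string.

```javascript
// Read preference modes accepted in a MongoDB connection string.
const READ_PREFERENCES = [
  "primary",
  "primaryPreferred",
  "secondary",
  "secondaryPreferred",
  "nearest",
];

// Return a copy of `uri` whose readPreference option is set to `preference`,
// replacing any readPreference that was already present.
function withReadPreference(uri, preference) {
  if (!READ_PREFERENCES.includes(preference)) {
    throw new Error(`Unknown read preference: ${preference}`);
  }
  const [base, query = ""] = uri.split("?");
  const params = query
    .split("&")
    .filter((p) => p && !p.startsWith("readPreference="));
  params.push(`readPreference=${preference}`);
  return `${base}?${params.join("&")}`;
}

// Hypothetical replica set with members in three regions.
const baseURI =
  "mongodb://eu.example.com:27017,us.example.com:27017,ap.example.com:27017/parse?replicaSet=rs0";

// Configuration fragment for a parse-server instance deployed near the users.
const edgeConfig = { databaseURI: withReadPreference(baseURI, "nearest") };
console.log(edgeConfig.databaseURI);
```

Writes still go to the primary regardless of this setting, so only read latency improves for the remote regions.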