Lbry-As-A-Container


#1

Scope of this post.

I would love to hear feedback on which parts of LBRY’s rich ecosystem should be developed and expanded on with Docker first, and to see who in the community would use what, so I can focus on that first. Personally, I think Spee.ch would be the most impactful, but Chainquery would be a pretty good performance booster if we could encourage development backed by it.

My goals

I plan on writing a deployable, stateless container cluster that follows Docker best practices, to the best of my ability. Ideally, this will have a handful of containers supporting the various components, to get a staging or even production environment of LBRY and other components running in a few minutes without having to read anything. The idea is to help people, through official channels, boot up an environment to kick the tires and test drive the tech. Some people want to get a feel for how something runs in the ideal environment before getting deep into the mechanics. Because what good is an app if it runs like shit and all you have to go on is the hype?

Let’s prove this product to the greater community! With Spee.ch we’re already making huge strides toward being insanely useful.

What do I mean by Stateless

Stateless means a container wrapping an application that you can deploy like a Lego block: you don’t have to care about what’s inside the container, and by default it should just work. However, you should be able to configure most, if not all, of the settings for the appliance inside. My other personal goal for the configuration mechanism is that it keeps working as the application’s configs change in the future, as far as possible, while keeping maintenance fairly trivial.
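To make the configuration idea a bit more concrete, here is a minimal sketch in Python of one way it could work: built-in defaults so the container just works, with environment variables layered on top. The `APPLIANCE_` prefix, the function name, and the default keys are all hypothetical and only illustrate the approach; nothing here is an existing LBRY interface.

```python
import os

# Hypothetical defaults for an appliance; the keys are illustrative only.
DEFAULTS = {
    "RPC_HOST": "127.0.0.1",
    "RPC_PORT": "8080",
    "LOG_LEVEL": "info",
}


def resolve_config(environ=None, prefix="APPLIANCE_", defaults=DEFAULTS):
    """Merge prefixed environment variables over the built-in defaults."""
    environ = os.environ if environ is None else environ
    config = dict(defaults)  # copy, so the defaults are never mutated
    for name, value in environ.items():
        if name.startswith(prefix):
            # Keys the image does not know about are passed through as-is,
            # so a new application option does not need a new image.
            config[name[len(prefix):]] = value
    return config
```

With nothing set, `resolve_config({})` returns the defaults unchanged. Setting something like `APPLIANCE_RPC_PORT=9000` on the container overrides only that one key, and unknown keys are forwarded rather than rejected, which is one way to avoid breaking when the application grows new settings.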

Maintainability

Anyone should be able to glance at what’s happening in the code and contribute a tweak or addition. If you think I’m doing something in a less than optimal way, feel free to reach out; I’m more than open to learning new and interesting concepts.


#2

Chainquery requires a lot of state. Like 5 servings of it lol. The database that backs it is the entire blockchain, expanded from its compressed form into something easily searchable. Also, populating the data takes more than a few days on a standard desktop. So it would be difficult to get it up and running quickly.

The important parts of Chainquery are the SQL API and the data itself. The application’s only purpose is to serve these requests and to populate this database.

I agree Chainquery is a big performance boost; it has helped us significantly at LBRY to deliver some really great functionality more easily and faster. However, it might not be a great first candidate for Docker.


#3

Haha, good point on the blockchain sync side of it. However, the database itself isn’t what needs to be “stateless”; it’s the application. The goal is to make it easy for the end user to deploy from either a pre-compiled binary or a container that compiles from source, the latter for the tin foil hatters and the potential contributors.

Is there no way to obtain a checkpoint of the database any faster? Like couldn’t we host checkpoint datasets on BitTorrent or something?

Aside from the security implications, the goal of shipping via a container is to have a standard method of shipping the entire platform, a bit at a time if need be.

So far the structure looks something like this:

  • ./spee.ch/
    • Dockerfile
    • compile/
      • Dockerfile
  • lbrynet-daemon/
    • Dockerfile
    • compile/
      • Dockerfile
  • chainquery/
    • Dockerfile
    • compile/
      • Dockerfile

Level1 directories in the repository each cover the whole scope of a single LBRY appliance.
Each one should contain everything you’d need to get a container cluster off the ground and a fully functional version of that component running.
Level2 directories within a Level1 directory will contain more specific container cases, such as a portable compiler container.
The end user should just be able to pull a copy of the LBRY-${appliance} source code and then, if Docker is installed on their host, just run the Docker command, and out comes a binary or up comes a testing node.
This isn’t the only scope for Level2, but at the moment these are the two use cases I can see for the addition of Docker.
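As a rough illustration of how that layout could be consumed, here is a small Python sketch that walks a checkout and lists every Level1 and Level2 directory holding a Dockerfile, then turns each into a `docker build` command. The function names, the `run` label for a Level1 Dockerfile, and the `lbry-<appliance>:<variant>` tag scheme are assumptions made for the example, not anything the repository defines.

```python
from pathlib import Path


def find_build_contexts(repo_root):
    """Return (appliance, variant, path) for each Dockerfile in the layout."""
    root = Path(repo_root)
    contexts = []
    for level1 in sorted(p for p in root.iterdir() if p.is_dir()):
        if (level1 / "Dockerfile").is_file():
            # "run" is a hypothetical label for the appliance's main image.
            contexts.append((level1.name, "run", level1))
        for level2 in sorted(p for p in level1.iterdir() if p.is_dir()):
            if (level2 / "Dockerfile").is_file():
                # Level2 directories hold specific cases such as "compile".
                contexts.append((level1.name, level2.name, level2))
    return contexts


def build_command(appliance, variant, path):
    """Build the argument list for `docker build`; the tag scheme is made up."""
    return ["docker", "build", "-t", f"lbry-{appliance}:{variant}", str(path)]
```

Running `find_build_contexts(".")` at the top of a checkout laid out as above would give one entry per appliance plus one per `compile/` directory, and each entry can be handed to `build_command` and then to something like `subprocess.run`.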


#4

I actually like this idea and I have thought about it in the past. The data in the database is not going to change. We should be able to offer checkpoint downloads that make things much easier, for example every 10K blocks. I know SQL Server does this sort of delta backup type thing. We are hosting tons of data already, so I don’t think hosting this is a problem at all and would be a drop in the bucket.

I would need to investigate what MySQL offers along these lines. A DB backup of Chainquery takes around 3 hours to restore. So that would lessen the impact significantly.
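To put numbers on the checkpoint idea, here is a tiny Python sketch of the arithmetic, assuming checkpoints land on exact multiples of 10,000 blocks. The function names are made up for illustration and the interval is just the example figure from above.

```python
CHECKPOINT_INTERVAL = 10_000  # "every 10K blocks", the example interval above


def latest_checkpoint(height, interval=CHECKPOINT_INTERVAL):
    """Highest checkpoint height at or below the given block height."""
    if height < 0:
        raise ValueError("block height cannot be negative")
    return (height // interval) * interval


def blocks_left_to_sync(height, interval=CHECKPOINT_INTERVAL):
    """Blocks still to process after restoring the latest checkpoint."""
    return height - latest_checkpoint(height, interval)
```

For a chain at height 423,517, the latest checkpoint would be 420,000, leaving 3,517 blocks to process after the restore instead of the whole chain.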


#5

Lenny, I created https://github.com/lbryio/chainquery/issues/50 as a result of this post.


#6

We’re going to be using a LBRY-provided Amazon S3 bucket for this. I may GPG-sign it as well.
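On the download side, a simple integrity check could look something like the Python sketch below. To be clear, this is a plain SHA-256 comparison and not GPG signature verification, and it assumes a hex digest would be published next to the checkpoint file, which nothing above promises. The function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a large dump never has to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected_hex):
    """True if the file's SHA-256 matches the (assumed) published digest."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

A checksum only catches corrupted or truncated downloads. Proving who produced the file would still need the GPG signature.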