A decentralized AI compute network — run LLM inference through an OpenAI-compatible API, rent on-demand NVIDIA GPUs by the minute, or earn by sharing your own.

ParalonCloud is a decentralized AI compute network. Instead of one company owning the hardware, independent providers around the world share their NVIDIA GPUs — and that pooled compute powers everything from a one-line model API to full GPU rentals. No long-term commitments; you pay only for what you use.

There are three ways to use the network.

Run inference

Call open models through an OpenAI-compatible API — point your existing OpenAI SDK at ParalonCloud and change two lines. Your requests are served on distributed GPU nodes.

from openai import OpenAI

client = OpenAI(
    api_key="prlc_your_key_here",
    base_url="https://paraloncloud.com/v1",
)

Playground — try the models in your browser, no code.
Console — create API keys (prlc_…) and track usage.

Rent compute

Need a whole machine instead of an API? Spin up a GPU in your browser in under a minute — a full Jupyter Lab environment with Python, a terminal, and your favorite ML libraries, on real NVIDIA hardware, billed per minute.

Rent a GPU — step-by-step, from browsing to a live notebook

Provide compute

Have an idle GPU? Put it to work and earn. Provider nodes power both inference and rentals on the network, and you stay in control — accept work when you want, stop anytime.

Prerequisites — hardware and software your machine needs
Add a Node — connect your machine in a few steps

How it works

Every workload runs on a provider's machine, reachable through a secure tunnel — no port forwarding, no public IP required. Inference is routed to a node that can serve the model; rentals run in an isolated container for the duration of your session. You pay only for what you use, and your data lives on the instance only as long as the work does.

ParalonCloud Documentation

Run inference

Rent compute

Provide compute

How it works