github – Ash Tewari

Giving OpenClaw Secure Access to Cloud Services Without Sharing Your Password

Device Code Flow has been built into OAuth2 for years, originally designed for TVs and game consoles. It works just as well for a Docker container. It requires no credentials stored on the agent machine. It gives you narrow, revocable access that the agent cannot exceed.

I have OpenClaw deployed in a sandboxed Docker container on a dedicated host. I communicate with it via a Telegram bot, in a locked-down chat.

I wired the device-code authentication pattern up with OpenClaw and Microsoft (Personal/Consumer) services, but the authentication approach works with any app and any command-line tool you want to run in a headless environment, including Custom APIs. Device code flow + Entra ID app registrations give you a zero-stored-credentials gateway to any HTTP API you can build, not just Microsoft Graph.

What Is Device Code Flow?

OAuth2 device code flow (RFC 8628) was designed for “input-constrained devices” — things without a keyboard or a browser. Think of how you sign in to Netflix on a smart TV: a short code appears on screen, you visit a URL on your phone, you type the code in, and the TV logs in. You never type your password on the TV.

The official name for that pattern is the device authorization grant. A Docker container is, from the protocol’s perspective, exactly the same kind of device. It has no browser. It cannot perform an interactive redirect. But it can make HTTP requests, and that is all it needs.

The flow has three steps:

1. The app posts to the identity provider’s device code endpoint. It gets back a short user code (like “WDJB-MJHT”), a verification URL, and a polling device code.

2. The user visits the URL on any browser, on any device — their phone, laptop, anything — and enters the short code. They sign in normally, with their usual credentials and MFA if it is enabled.

3. The app polls the token endpoint in the background. Once the user finishes signing in, the poll returns a real access token and a refresh token. The app stores these and uses them for API calls going forward.

The app never sees the password. The password goes directly from the user’s browser to the identity provider. The app only ever handles two things: a short temporary code that it sends to the user, and a token that the identity provider gives back once the user has authenticated.

OpenClaw running in a Docker container fits this model exactly.

Which Services Support this?

Device code flow is not a Microsoft-only feature. It is a standard OAuth2 extension (RFC 8628) and most major identity providers support it — including Microsoft, Google, GitHub, and AWS. If a service uses Okta or Auth0 for identity, those support it too.

The Security Architecture

The agent never sees your credentials. Your password is typed in your browser, on your device, to your identity provider’s servers. It never touches the container. The agent only handles a short temporary code to give you, and a token issued by the provider once you have authenticated.

The device code is one-time and short-lived. After the user authenticates, the code is permanently invalidated. An intercepted code is useless without the user’s credentials and MFA.

Scopes define the ceiling. You configure the OAuth app or app registration to request only specific permissions. The agent cannot exceed those scopes. If you configure read-only access to email, the token cannot be used to send or delete email.
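
One way to make that ceiling visible is to compare the scopes the provider actually granted with the ones the app asked for. The sketch below assumes the token response carries the standard space-delimited `scope` field from OAuth2. The helper name `unexpected_scopes` and the example scope names are illustrative only, and some providers rewrite or add scopes in the response, so treat a non-empty result as something to inspect rather than as a hard failure.

```python
def unexpected_scopes(requested: str, token_response: dict) -> set:
    """Return granted scopes that were not part of the requested set."""
    asked = set(requested.split())
    # The token response lists granted scopes as one space-delimited string.
    granted = set(token_response.get("scope", "").split())
    return granted - asked


# Hypothetical example: the app asked for read-only mail access.
response = {"access_token": "...", "scope": "Mail.Read offline_access"}
print(unexpected_scopes("Mail.Read offline_access", response))
```

An empty set means the token grants nothing beyond what was requested.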

This is the authentication pattern that many of the consumer devices you already own use to access your accounts on your behalf. Your TV’s YouTube app, your smart home hub’s Google integration, the GitHub CLI you use on your workstation — these all use device flow. You have been trusting it for years without knowing what it was called.

How the implementation works

There are three components, spread across two Docker services. OpenClaw reaches the sidecar through the Docker socket.

The app sidecar is your application container. It runs your CLI tool and your auth logic. It does nothing on its own. Its only job is to hold the authenticated token cache and execute commands on demand when OpenClaw calls into it.

OpenClaw is the AI agent container. It connects to Telegram, understands natural language, and knows which skills to run for which requests. It does not know anything about OAuth or your specific app. It just runs shell commands you specified and reports results back to you.

The Telegram auth bridge is a small script that lives inside the sidecar. It registers a device code callback, requests an auth flow, and uses the Telegram Bot API to forward the code to you — then confirms when authentication completes.

The compose file looks roughly like this:

```yaml
services:
  myapp-sidecar:
    image: ghcr.io/youruser/myapp:latest
    command: ["sleep", "infinity"]
    environment:
      - XDG_DATA_HOME=/data
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
      - TELEGRAM_CHAT_ID=${TELEGRAM_CHAT_ID}
    volumes:
      - ${APP_DATA_DIR:-./app-data}:/data
    restart: unless-stopped

  openclaw-gateway:
    build:
      context: .
      dockerfile: Dockerfile.openclaw
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./skills:/app/skills
    group_add:
      - "${DOCKER_GID:-999}"
    environment:
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
    ports:
      - "18789:18789"
    restart: unless-stopped
```

The Callback Pattern

The auth manager has a method called set_device_code_callback. You pass it a function, and when the device code is ready, the auth manager calls your function with the code and the verification URL rather than trying to open a browser.

```python
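import asyncio
import time
import webbrowser

import httpx

# OAUTH_SCOPES, AuthError, and the _load_cache/_store_tokens helpers
# are defined elsewhere in the app and are not shown here.


class AuthManager: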
    def __init__(self, client_id: str, authority: str) -> None:
        self._client_id = client_id
        self._authority = authority.rstrip("/")
        self._on_device_code = None
        self._tokens = self._load_cache()

    def set_device_code_callback(self, fn) -> None:
        self._on_device_code = fn

    async def _device_code_auth(self) -> None:
        async with httpx.AsyncClient() as client:
            resp = await client.post(
                f"{self._authority}/oauth2/v2.0/devicecode",
                data={"client_id": self._client_id, "scope": OAUTH_SCOPES},
                timeout=30,
            )
        resp.raise_for_status()  # surface HTTP errors before reading the payload
        flow = resp.json()
        user_code = flow["user_code"]
        verification_uri = flow["verification_uri"]
        device_code = flow["device_code"]
        interval = flow.get("interval", 5)
        expires_in = flow.get("expires_in", 300)

        if self._on_device_code:
            self._on_device_code({
                "user_code": user_code,
                "verification_uri": verification_uri,
                "message": flow.get("message", ""),
            })
        else:
            try:
                webbrowser.open(verification_uri)
            except Exception:
                pass

        deadline = time.time() + expires_in
        while time.time() < deadline:
            await asyncio.sleep(interval)
            async with httpx.AsyncClient() as client:
                resp = await client.post(
                    f"{self._authority}/oauth2/v2.0/token",
                    data={
                        "client_id": self._client_id,
                        "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
                        "device_code": device_code,
                    },
                    timeout=30,
                )
            body = resp.json()
            if resp.status_code == 200 and "access_token" in body:
                self._store_tokens(body)
                return
            error = body.get("error", "")
            if error == "authorization_pending":
                continue
            elif error == "slow_down":
                interval += 5
            elif error in ("authorization_declined", "expired_token"):
                raise AuthError(error)
            else:
                raise AuthError(body.get("error_description", "Auth failed"))
```

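The polling loop handles the error codes the spec defines: authorization_pending (keep waiting), slow_down (back off by adding 5 seconds to the interval), and the terminal errors that stop polling. RFC 8628 names those terminal errors access_denied and expired_token; Microsoft's endpoint reports a declined request as authorization_declined.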
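
Once tokens are stored, the refresh token lets the sidecar renew access without prompting for another device code. The following is a minimal sketch under assumptions, not the actual auth manager code: the `obtained_at` field in the cached dict and both helper names are hypothetical, while the form fields follow the standard OAuth2 refresh grant.

```python
import time
from typing import Optional


def is_expired(tokens: dict, now: Optional[float] = None, skew: int = 60) -> bool:
    """True when the cached access token is within `skew` seconds of expiry."""
    now = time.time() if now is None else now
    # `obtained_at` is a hypothetical timestamp recorded when the token was stored;
    # `expires_in` is the lifetime in seconds from the token response.
    return now >= tokens["obtained_at"] + tokens["expires_in"] - skew


def refresh_form(client_id: str, tokens: dict) -> dict:
    """Form fields for the OAuth2 refresh_token grant."""
    return {
        "client_id": client_id,
        "grant_type": "refresh_token",
        "refresh_token": tokens["refresh_token"],
    }


cached = {"obtained_at": 1000.0, "expires_in": 3600, "refresh_token": "r1"}
print(is_expired(cached, now=2000.0))
print(refresh_form("my-client-id", cached))
```

The form would be posted to the same token endpoint the polling loop uses, and the response stored in place of the old tokens.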

The Telegram Bridge

With the callback mechanism in place, the Telegram bridge is just a script that registers a callback and uses the Telegram Bot API to deliver the code to you:

```python
import asyncio, os, sys, traceback
import httpx
from your_app.auth import AuthManager
from your_app.config import Settings

TELEGRAM_BOT_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]
TELEGRAM_CHAT_ID = os.environ["TELEGRAM_CHAT_ID"]

async def send_telegram(text: str) -> None:
    url = f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage"
    async with httpx.AsyncClient() as client:
        await client.post(url, json={
            "chat_id": TELEGRAM_CHAT_ID,
            "text": text,
            "parse_mode": "Markdown",
        }, timeout=30)

async def main() -> int:
    settings = Settings()
    auth = AuthManager(settings.client_id, settings.authority)

    if auth.is_signed_in():
        await send_telegram("Already authenticated.")
        return 0

    def on_device_code(info: dict) -> None:
        msg = (
            "*Auth Required*\n\n"
            f"Go to: {info['verification_uri']}\n"
            f"Enter code: `{info['user_code']}`\n\n"
            "Complete sign-in in your browser and I'll confirm when done."
        )
        try:
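            # get_running_loop() raises RuntimeError when no loop is running,
            # which triggers the synchronous fallback below.
            asyncio.get_running_loop().create_task(send_telegram(msg))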
        except RuntimeError:
            import requests
            requests.post(
                f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage",
                json={"chat_id": TELEGRAM_CHAT_ID, "text": msg, "parse_mode": "Markdown"},
                timeout=10,
            )

    auth.set_device_code_callback(on_device_code)
    try:
        user_info = await auth.sign_in()
        await send_telegram(f"*Signed in* as {user_info['name']} ({user_info['email']})")
        return 0
    except Exception as exc:
        await send_telegram(f"*Auth failed:* {exc}\n```{traceback.format_exc()[:800]}```")
        return 2

if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
```

The OpenClaw Skill

On the OpenClaw side, a “skill” is a markdown file. When OpenClaw sees a trigger phrase in Telegram, it runs the associated shell command. The markdown file looks like this:

---
name: myapp-auth
description: Authenticate with the cloud service via device code flow, coordinated through Telegram.
---

## myapp-auth

Use this skill when the user sends “/auth”, “authenticate”, or “sign in”.

Run the following command and wait up to 360 seconds for it to complete.
The script sends the device code to the user via Telegram and confirms when done.

docker exec myapp-sidecar-1 python /app/scripts/telegram_auth.py

Summary

Device code flow is a mature, widely supported OAuth2 pattern that maps naturally onto OpenClaw running in a Docker container. With this pattern OpenClaw never handles your credentials. It gets scoped, revocable tokens from the identity provider, after you have authenticated in your own browser. A Telegram bot is all you need to coordinate the handoff of the temporary code.

The approach works for many use cases across Microsoft accounts, Google, GitHub, AWS, Auth0, Okta and others. It supports exactly the kinds of personal automations that make OpenClaw genuinely useful in daily life.

AI Powered Bookshelf

Bookshelf is a Generative AI application built as a rudimentary, but fairly capable, RAG implementation written in Python. It can use an open source LLM (running locally or in the cloud) or a GPT model via OpenAI’s API.

The application is created using streamlit.
I used llama-index for orchestrating the loading of documents into the vector database. Only TokenTextSplitter is currently used. It does not optimize for PDF, html and other formats.
ChromaDb is the vector database to store the embedding vectors and metadata of the document nodes.
You can use any open source embeddings model from HuggingFace.
Bookshelf will automatically use the GPU when creating local embeddings, if the GPU is available on your machine.
You can use OpenAI embeddings as well. There is no way to use a specific OpenAI embedding model or configure the parameters yet.
Use OpenAI API or any OpenAI compatible LLM API (using LMStudio, Ollama or text-generation-webui) of your choice.
There is a live demo on streamlit cloud – https://bookshelf.streamlit.app/
The demo allows only OpenAI integration. You can run it locally for accessing Open Source embedding models and LLMs.

Live demo – https://bookshelf.streamlit.app/

You will need your OpenAI api key for the demo.

If you are running it locally, you will have the option of using an Open Source LLM instance via an API Url. In the screenshot, I am using an open source Embedding Model from HuggingFace (sentence-transformers/all-mpnet-base-v2) and the local LLM server at http://localhost:1234/v1

Collections tab shows all collections in the database. It also shows the names of all the files in the selected collection. You can inspect individual chunks for the metadata and text of each chunk. You can delete all contents of the collection (there is no warning).

You can modify the collection name to create a new collection. Multiple files can be uploaded at the same time. You can specify if you want to extract metadata from the file contents. Enabling this option can add significant cost because it employs Extractors, which use an LLM to generate titles, summaries, keywords and questions for each document.

On the Retrieve tab, you can query chunks which are semantically related to your query.
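
Semantic retrieval of this kind generally comes down to ranking stored chunk embeddings by their similarity to the query embedding. The toy sketch below illustrates the idea with hand-made three-dimensional vectors and cosine similarity. It is not Bookshelf's or ChromaDb's code; the `top_k` helper and the sample chunks are made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, chunks, k=2):
    """Return the k chunk texts most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Made-up (text, embedding) pairs standing in for stored document chunks.
chunks = [
    ("Device code flow basics", [0.9, 0.1, 0.0]),
    ("Bootstrap grid mixins", [0.0, 0.2, 0.9]),
    ("OAuth2 token refresh", [0.8, 0.3, 0.1]),
]
print(top_k([1.0, 0.0, 0.0], chunks))
```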

On the Prompt tab, you can prompt your LLM. The context as well as the Prompt Template is editable.

Here is an example of using the context retrieved from chunks in the Vector database to query the LLM.

This inference was performed using Phi3 model running locally on LMStudio.

Code is on Github – https://github.com/ashtewari/bookshelf

Have fun!

KeyNode with Node.js and Microsoft Azure

KeyNode is an application to issue and verify software license keys. The technology stack for KeyNode is Node.js, MongoDB and Microsoft Azure.

I had built this functionality with C9.io (a cloud-based IDE with a built-in source code repository and debugger), mongohq (MongoDB as a service – now part of compose.io) and appfog (Cloud PAAS built on top of CloudFoundry). It used SMTP/gmail to email license files. That was the version I created a couple of years ago to issue tamper-proof signed xml license files for CodeDemo (a code snippet tool for developers, presenters and instructors).

For KeyNode (open source) I switched to a different toolset: Visual Studio Code and Windows Azure. I simplified the code by removing the signed xml file and open-sourced it on GitHub. Signed xml allowed offline verification in CodeDemo (a WPF desktop app); without it, verification has to happen online. I am working on adding the web endpoint for verification of license keys. This version uses SendGrid to email license keys. KeyNode is deployed as a Windows Azure Web App, with Continuous Deployment from the source code repository on GitHub.

I created and tested this Node.js application locally without IIS and deployed it as an Azure Web App without making any changes to the code at all. Node.js applications are hosted in Azure under IIS with iisnode, a native IIS module that allows hosting of Node.js applications in IIS on Windows. Read more about iisnode here. The iisnode architecture also makes it significantly easier to take advantage of the scalability afforded by Azure.

KeyNode is a work in progress. My plan is to use this as the basis for further explorations in the following areas :

DevOps, Docker and Microservices (at miniature scale of course!)
Create a Web UI with Express (a Node.js web application framework)
Integrate with Azure Storage/Queues
and more…

I invite you to check out the live site on Azure and fork it for your own experiments : KeyNode on GitHub.

Resources :

Photo Credit : Piano Keyboard (www.kpmalinowski.pl)

The Site44 Workflow

A lightweight development workflow with real-time website deployment.

I recently built a sample website to illustrate how clean, semantic html markup can be maintained when using Bootstrap’s grid system. The solution is to use a css pre-processor to incorporate Bootstrap’s LESS based mixins into your own .less files and push the Bootstrap instructions down into your stylesheets. There are two ways to “compile” .less stylesheets: use a stand-alone LESS compiler or use less.js. I found it very convenient to use less.js (note that it is not recommended in production deployment). As I started developing the sample code, I found it a bit cumbersome to work with an entire web application project in Visual Studio, considering I was only working with some really simple client-side html and css. As I looked for an alternative, I stumbled onto a development workflow that is incredibly simple and a lot of fun. I call it the Site44 workflow. Site44 turns your dropbox folders into websites. And it is awesome! Here is what you do –

1. Sign into Site44.com using your dropbox credentials.

2. Create a new site (all you have to do is come up with a name). I named it “ash”. A sub-folder with this name will show up in your dropbox folder.

3. Drag this folder to your Github for Windows screen and drop it there to create a github repo in that folder and push it to github.

4. Smile and write code.

As you save your code, the changes are deployed in real time to your website. You commit to your github repo as you please. If you revert to a different version/branch of your code from your git repo, that version will be deployed (almost) instantly to your website. I wish there was a .site44ignore feature in Site44, just like .gitignore. That would allow me to keep my .git folder (and some other files) from getting published to the website. Other than that, this worked out really well for me.

I wrote about the experience of extending Bootstrap with LESS here : Bootstrap with LESS.

Hat tip to Justin Saraceno for introducing me to site44.

Protecting Your Api Keys

I am working on a Windows 8 app (details to follow in a subsequent post) and the code is published in a public repo on github. My app uses third-party APIs and after I committed the first cut to github, I realized that I had included my api keys in the code. The whole world had access to my keys. I did not want to publish the developer keys for those APIs to the entire world.

When the app is released and distributed, those keys will need to be included in the app somehow. Once the keys are out there, they cannot be 100% protected from a determined mind. So, why bother? Why would I want to hide the api keys in the source code? Here are some good reasons –

1. It might be illegal to put the keys out there in plain sight for the whole world to see.
2. Developer keys may be throttled or have other restrictions on how many times they can be used per day or per minute.
3. The keys might allow access to expensive cloud computing resources.
4. The keys might allow access to confidential/sensitive customer data.

First, I had to take my keys back from the git repo. Can you really remove information from a public git repository? Yes, you can, using git filter-branch. Here is how – https://help.github.com/articles/remove-sensitive-data. It worked! I successfully rewrote the history! My past commits no longer contain the file(s) that had my private api keys.

Next, I made sure that I don’t make this mistake again –

1. I added a new file ApiKeys.cs to the project.
2. Exposed the api keys as constants from a static class in this new file.
3. Added ApiKeys.cs to the .gitignore file, to prevent this file from being committed to the repository.
4. Added instructions in ReadMe.txt for external developers to include their own keys.

This is not an ideal solution. If you are using a continuous build server, this technique will obviously not work. The code will not compile as-is, a file must be added to the project before it will start compiling. This works for me for now, but I am still looking for a better solution.

Solution to the fetch puzzle

Here is a brute force solution to the fetch puzzle.

The puzzle goes like this – You have two buckets. A 3 gallon bucket and a 5 gallon bucket. Buckets are not marked or graduated. You are to fetch 4 gallons of water in a single trip to the river. How will you do it?

Basically, at each step there are three possibilities :

You can fill a bucket.
You can transfer water from one bucket to the other one.
You can dump out the water from a bucket.

In this brute force solution, I try each one of these steps and then try all three again after each one of the previous steps. And on and on until I get the required amount of water in one of the buckets.

Check it out. Source code is on my github repo – https://github.com/ashtewari/fetch
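
The brute-force idea described above can be sketched in a few lines of Python. This is not the code from the repo; it is an illustrative breadth-first search over bucket states, which tries every fill, transfer and dump from each state reached so far and therefore finds a shortest sequence of steps. The function name `solve` is made up for this example.

```python
from collections import deque


def solve(cap_a: int, cap_b: int, target: int):
    """Breadth-first search over (a, b) bucket states; returns a shortest path or None."""
    start = (0, 0)
    parent = {start: None}  # maps each visited state to the state it came from
    queue = deque([start])
    while queue:
        a, b = state = queue.popleft()
        if target in state:
            path = []
            while state is not None:  # walk back to the start
                path.append(state)
                state = parent[state]
            return path[::-1]
        pour_ab = min(a, cap_b - b)  # how much fits when pouring a into b
        pour_ba = min(b, cap_a - a)  # how much fits when pouring b into a
        moves = [
            (cap_a, b), (a, cap_b),        # fill a bucket
            (a - pour_ab, b + pour_ab),    # transfer a -> b
            (a + pour_ba, b - pour_ba),    # transfer b -> a
            (0, b), (a, 0),                # dump a bucket
        ]
        for nxt in moves:
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None  # target amount is unreachable


print(solve(3, 5, 4))
# [(0, 0), (0, 5), (3, 2), (0, 2), (2, 0), (2, 5), (3, 4)]
```

Each tuple is (gallons in the 3 gallon bucket, gallons in the 5 gallon bucket), so the puzzle takes six steps and ends with 4 gallons in the larger bucket.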