Giving OpenClaw Secure Access to Cloud Services Without Sharing Your Password

Device Code Flow has been built into OAuth2 for years, originally designed for TVs and game consoles. It works just as well for a Docker container. It requires no credentials stored on the agent machine. It gives you narrow, revocable access that the agent cannot exceed.

I have an OpenClaw agent running in a Docker container on a dedicated host. I communicate with it via a Telegram bot.

I wired the device-code authentication pattern up between OpenClaw and Microsoft services, but the approach works with any app or command-line tool you want to run in a headless environment, against many of the services you use every day.

What Is Device Code Flow?

OAuth2 device code flow (RFC 8628) was designed for “input-constrained devices” — things without a keyboard or a browser. Think of how you sign in to Netflix on a smart TV: a short code appears on screen, you visit a URL on your phone, you type the code in, and the TV logs in. You never type your password on the TV.

The official name for that pattern is the device authorization grant. A Docker container is, from the protocol’s perspective, exactly the same kind of device. It has no browser. It cannot perform an interactive redirect. But it can make HTTP requests, and that is all it needs.

The flow has three steps:

1. The app posts to the identity provider’s device code endpoint. It gets back a short user code (like “WDJB-MJHT”), a verification URL, and a polling device code.

2. The user visits the URL on any browser, on any device — their phone, laptop, anything — and enters the short code. They sign in normally, with their usual credentials and MFA if it is enabled.

3. The app polls the token endpoint in the background. Once the user finishes signing in, the poll returns a real access token and a refresh token. The app stores these and uses them for API calls going forward.
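
The step 1 response can be unpacked with a small helper like the one below. The field names come from RFC 8628; the fallback values are assumptions for providers that omit the optional fields.

```python
def parse_device_code_response(body: dict) -> dict:
    """Extract the fields the app needs from a device authorization response.

    RFC 8628 requires device_code, user_code, verification_uri, and
    expires_in; interval is optional with a default of 5 seconds.
    The 900-second expires_in fallback here is an assumption.
    """
    return {
        "device_code": body["device_code"],            # opaque code the app polls with
        "user_code": body["user_code"],                # short code shown to the user
        "verification_uri": body["verification_uri"],  # URL the user visits
        "expires_in": body.get("expires_in", 900),     # lifetime of both codes
        "interval": body.get("interval", 5),           # minimum polling interval
    }
```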

The app never sees the password. The password goes directly from the user’s browser to the identity provider. The app only ever handles two things: a short temporary code that it sends to the user, and a token that the identity provider gives back once the user has authenticated.

OpenClaw running in a Docker container fits this model exactly.

Which Services Support This?

Device code flow is not a Microsoft-only feature. It is a standard OAuth2 extension (RFC 8628) and most major identity providers support it — including Microsoft, Google, GitHub, and AWS. If a service uses Okta or Auth0 for identity, those support it too.

What can you do with this?

An AI agent signed into a Microsoft account through device code flow can do a lot more than people usually expect, even with read-only permissions. With the right delegated Microsoft Graph scopes, it can read your OneDrive files, check your calendar, inspect your inbox metadata, and look through your contacts, all without your password ever being stored on the machine.

  • You can ask OpenClaw to find likely folders or files in OneDrive with prompts like “Find the folder where I put last year’s tax documents,” because Graph supports searching drive items and Files.Read is available for personal Microsoft accounts.
  • You can ask what is on your calendar tomorrow afternoon and get a time-bounded answer from calendarView without granting write access, using Calendars.ReadBasic or Calendars.Read.
  • You can ask whether you have unread email from a specific sender like your bank, airline, or school, because Graph supports filtering messages by sender and unread status, and Mail.ReadBasic is available for personal Microsoft accounts.
  • You can ask for upcoming birthdays this month from your Outlook contacts, because contacts expose a birthday field and Contacts.Read is available for personal Microsoft accounts.
  • You can turn OpenClaw into a read-only life dashboard that checks your next appointments, recent files, important unread mail, and upcoming birthdays from one signed-in identity.
  • You can start with read-only scopes and expand later only if you want more action-taking behavior, such as creating calendar events, editing OneDrive files, sending mail, or updating OneNote. Microsoft’s permissions reference exposes those broader delegated scopes separately, so the permission boundary is explicit.

None of this requires writing a password anywhere. The agent holds a scoped access token that can be revoked at any time.
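
As a sketch of what the calendar example looks like on the wire, the helper below builds a time-bounded calendarView request. The endpoint and query parameter names come from Microsoft Graph's calendarView API; the function name and tuple shape are purely illustrative.

```python
GRAPH_BASE = "https://graph.microsoft.com/v1.0"

def build_calendar_view_request(access_token: str, start_iso: str, end_iso: str) -> tuple:
    """Build a GET request for calendar events between two ISO-8601 timestamps."""
    url = f"{GRAPH_BASE}/me/calendarView"
    headers = {"Authorization": f"Bearer {access_token}"}
    params = {
        "startDateTime": start_iso,                # e.g. "2025-06-01T12:00:00"
        "endDateTime": end_iso,
        "$select": "subject,start,end,location",   # keep the payload small
        "$orderby": "start/dateTime",
    }
    return url, headers, params

# The agent would then issue: httpx.get(url, headers=headers, params=params)
```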

With Google services, you could potentially do things like these:

  • You can drop a file into an “OpenClaw Inbox” and have OpenClaw read it, summarize it, extract action items, or answer questions about it without giving the app access to your whole Drive.
  • You can share a handful of approved documents with OpenClaw and turn it into a focused research assistant that searches only those files by title or content when you ask questions.
  • You can have OpenClaw draft and revise Google Docs for you so it can turn rough notes into polished writeups, reports, outlines, or meeting summaries inside documents you’ve explicitly approved.
  • You can let OpenClaw store its own private memory in your Google account so it remembers settings, progress, checkpoints, and past work without cluttering your visible Drive.
  • You can have OpenClaw manage custom YouTube playlists for you so it builds queues like “Watch This Week,” “Learn Later,” or topic-based collections without manual sorting.
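
As a sketch of the first bullet, the helper below builds a Drive v3 files.list search. The endpoint and query syntax come from the Drive v3 API; with the drive.file scope, such a listing only ever sees files the user explicitly shared with the app. Names here are illustrative.

```python
DRIVE_BASE = "https://www.googleapis.com/drive/v3"

def build_drive_search_request(access_token: str, name_contains: str) -> tuple:
    """Build a GET request that searches files by name, excluding trashed items."""
    url = f"{DRIVE_BASE}/files"
    headers = {"Authorization": f"Bearer {access_token}"}
    params = {
        "q": f"name contains '{name_contains}' and trashed = false",
        "fields": "files(id, name, mimeType, modifiedTime)",  # only what we need
        "pageSize": 10,
    }
    return url, headers, params
```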

The Security Architecture

Here is what makes this better than the alternatives. The agent never sees your credentials. Your password is typed in your browser, on your device, to your identity provider’s servers. It never touches the container. The agent only handles a short temporary code to give you, and a token issued by the provider once you have authenticated.

The device code is one-time and short-lived. After the user authenticates, the code is permanently invalidated. An intercepted code is useless without the user’s credentials and MFA.

Scopes define the ceiling. You configure the OAuth app or app registration to request only specific permissions. The agent cannot exceed those scopes. If you configure read-only access to email, the token cannot be used to send or delete email.
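
The token response echoes the scopes that were actually granted, so the agent can check its own ceiling before attempting an action. A minimal guard, with illustrative names:

```python
def assert_within_scope(granted_scopes: str, required: str) -> bool:
    """Check whether a granted scope string (space-delimited, as returned in
    the token response's `scope` field) includes the scope an action needs."""
    return required in granted_scopes.split()

# A read-only token cannot be used to send mail:
# assert_within_scope("Mail.ReadBasic Calendars.Read", "Mail.Send") is False
```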

This is the authentication pattern that many of the consumer devices you already own use to access your accounts on your behalf. Your TV’s YouTube app, your smart home hub’s Google integration, the GitHub CLI you use on your workstation — these all use device flow. You have been trusting it for years without knowing what it was called.

How the implementation works

There are three components. The sidecar and OpenClaw run as separate Docker services, and OpenClaw reaches into the sidecar through the Docker socket; the third component is a script that lives inside the sidecar.

The app sidecar is your application container. It runs your CLI tool and your auth logic. It does nothing on its own. Its only job is to hold the authenticated token cache and execute commands on demand when OpenClaw calls into it.

OpenClaw is the AI agent container. It connects to Telegram, understands natural language, and knows which skills to run for which requests. It does not know anything about OAuth or your specific app. It just runs shell commands and reports results back to you.

The Telegram auth bridge is a small script that lives inside the sidecar. It registers a device code callback, requests an auth flow, and uses the Telegram Bot API to forward the code to you — then confirms when authentication completes.

The compose file looks roughly like this:

```yaml
services:
  myapp-sidecar:
    image: ghcr.io/youruser/myapp:latest
    command: ["sleep", "infinity"]
    environment:
      - XDG_DATA_HOME=/data
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
      - TELEGRAM_CHAT_ID=${TELEGRAM_CHAT_ID}
    volumes:
      - ${APP_DATA_DIR:-./app-data}:/data
    restart: unless-stopped

  openclaw-gateway:
    build:
      context: .
      dockerfile: Dockerfile.openclaw
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./skills:/app/skills
    group_add:
      - "${DOCKER_GID:-999}"
    environment:
      - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
    ports:
      - "18789:18789"
    restart: unless-stopped
```

The Callback Pattern

The auth manager has a method called set_device_code_callback. You pass it a function, and when the device code is ready, the auth manager calls your function with the code and the verification URL rather than trying to open a browser.

```python
import asyncio
import time
import webbrowser

import httpx


class AuthManager:
    def __init__(self, client_id: str, authority: str) -> None:
        self._client_id = client_id
        self._authority = authority.rstrip("/")
        self._on_device_code = None
        self._tokens = self._load_cache()

    def set_device_code_callback(self, fn) -> None:
        self._on_device_code = fn

    async def _device_code_auth(self) -> None:
        async with httpx.AsyncClient() as client:
            resp = await client.post(
                f"{self._authority}/oauth2/v2.0/devicecode",
                data={"client_id": self._client_id, "scope": OAUTH_SCOPES},
                timeout=30,
            )
        flow = resp.json()
        user_code = flow["user_code"]
        verification_uri = flow["verification_uri"]
        device_code = flow["device_code"]
        interval = flow.get("interval", 5)
        expires_in = flow.get("expires_in", 300)

        if self._on_device_code:
            self._on_device_code({
                "user_code": user_code,
                "verification_uri": verification_uri,
                "message": flow.get("message", ""),
            })
        else:
            try:
                webbrowser.open(verification_uri)
            except Exception:
                pass

        deadline = time.time() + expires_in
        while time.time() < deadline:
            await asyncio.sleep(interval)
            async with httpx.AsyncClient() as client:
                resp = await client.post(
                    f"{self._authority}/oauth2/v2.0/token",
                    data={
                        "client_id": self._client_id,
                        "grant_type": "urn:ietf:params:oauth:grant-type:device_code",
                        "device_code": device_code,
                    },
                    timeout=30,
                )
            body = resp.json()
            if resp.status_code == 200 and "access_token" in body:
                self._store_tokens(body)
                return
            error = body.get("error", "")
            if error == "authorization_pending":
                continue
            elif error == "slow_down":
                interval += 5
            elif error in ("authorization_declined", "expired_token"):
                raise AuthError(error)
            else:
                raise AuthError(body.get("error_description", "Auth failed"))
        # The codes expired before the user finished signing in.
        raise AuthError("expired_token")
```

The polling loop handles the three error codes the spec defines: authorization_pending (keep waiting), slow_down (back off and increase the interval by 5 seconds), and the terminal errors that stop polling.
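
Pulled out of the loop as a pure function (names here are illustrative, not part of any library), that decision logic is easy to unit-test:

```python
def next_poll_action(status_code: int, body: dict, interval: int) -> tuple:
    """Decide what the polling loop should do after one token request.

    Returns ("done", interval), ("retry", new_interval), or ("fail", reason),
    following the error codes RFC 8628 defines.
    """
    if status_code == 200 and "access_token" in body:
        return ("done", interval)
    error = body.get("error", "")
    if error == "authorization_pending":
        return ("retry", interval)        # user hasn't finished signing in yet
    if error == "slow_down":
        return ("retry", interval + 5)    # back off as the spec requires
    return ("fail", body.get("error_description", error or "auth failed"))
```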

The Telegram Bridge

With the callback mechanism in place, the Telegram bridge is just a script that registers a callback and uses the Telegram Bot API to deliver the code to you:

```python
import asyncio, os, sys, traceback
import httpx
from your_app.auth import AuthManager
from your_app.config import Settings

TELEGRAM_BOT_TOKEN = os.environ["TELEGRAM_BOT_TOKEN"]
TELEGRAM_CHAT_ID = os.environ["TELEGRAM_CHAT_ID"]

async def send_telegram(text: str) -> None:
    url = f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage"
    async with httpx.AsyncClient() as client:
        await client.post(url, json={
            "chat_id": TELEGRAM_CHAT_ID,
            "text": text,
            "parse_mode": "Markdown",
        }, timeout=30)

async def main() -> int:
    settings = Settings()
    auth = AuthManager(settings.client_id, settings.authority)

    if auth.is_signed_in():
        await send_telegram("Already authenticated.")
        return 0

    def on_device_code(info: dict) -> None:
        msg = (
            "*Auth Required*\n\n"
            f"Go to: {info['verification_uri']}\n"
            f"Enter code: `{info['user_code']}`\n\n"
            "Complete sign-in in your browser and I'll confirm when done."
        )
        try:
            # Schedule on the running loop if the callback fired inside one.
            asyncio.get_running_loop().create_task(send_telegram(msg))
        except RuntimeError:
            # No running loop in this thread; fall back to a blocking request.
            import requests
            requests.post(
                f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage",
                json={"chat_id": TELEGRAM_CHAT_ID, "text": msg, "parse_mode": "Markdown"},
                timeout=10,
            )

    auth.set_device_code_callback(on_device_code)
    try:
        user_info = await auth.sign_in()
        await send_telegram(f"*Signed in* as {user_info['name']} ({user_info['email']})")
        return 0
    except Exception as exc:
        await send_telegram(f"*Auth failed:* {exc}\n```{traceback.format_exc()[:800]}```")
        return 2

if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
```

The OpenClaw Skill

On the OpenClaw side, a “skill” is a markdown file. When OpenClaw sees a trigger phrase in Telegram, it runs the associated shell command. The markdown file looks like this:


```markdown
---
name: myapp-auth
description: Authenticate with the cloud service via device code flow, coordinated through Telegram.
---

## myapp-auth

Use this skill when the user sends “/auth”, “authenticate”, or “sign in”.

Run the following command and wait up to 360 seconds for it to complete.
The script sends the device code to the user via Telegram and confirms when done.

    docker exec myapp-sidecar-1 python /app/scripts/telegram_auth.py
```

Summary

Device code flow is a mature, widely supported OAuth2 pattern that maps naturally onto OpenClaw running in a Docker container. With this pattern, OpenClaw never handles your credentials. It gets scoped, revocable tokens from the identity provider after you have authenticated in your own browser. A Telegram bot is all you need to coordinate the handoff of the temporary code.

The approach works for many use cases across Microsoft accounts, Google, GitHub, AWS, Auth0, Okta, and others. It supports exactly the kinds of personal automations that make an agent like OpenClaw genuinely useful in daily life.


OpenflowSight – Log Analysis for Snowflake Openflow Telemetry

OpenflowSight is a Streamlit application for searching, filtering, and analyzing Openflow telemetry data, inspired by Azure App Insights Search UX.

It’s a log explorer focused on Openflow telemetry. It helps you quickly identify relevant events, see when they spiked, and which processors were involved. Search, group similar events, and export results for deeper offline analysis.

Openflow is an exciting new unified data integration tool from Snowflake, based on Apache NiFi. Logs, traces, and metrics emitted at runtime are written to the Event table, which follows the OpenTelemetry data model. OpenflowSight surfaces this data for convenient, efficient monitoring and analysis of Openflow operations from a graphical dashboard instead of through SQL queries.

Key Features

  • Runtime Filtering — Select multiple runtimes, toggle system runtimes, filter by processor and log level
  • Smart Search — Multi-term search (comma-separated) with contains, regex, and exact match modes
  • Timeline View — Interactive histogram with zoom/pan, configurable time buckets (1 min to 1 hour), preset windows (1h/6h/24h/7d)
  • Grouped Patterns — Fuzzy clustering that normalizes dynamic values (timestamps, UUIDs) for accurate pattern grouping
  • Individual Logs — Excel-like grid with sorting, filtering, pagination, and multi-select
  • CSV Export — One-click export with smart file naming
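
The fuzzy grouping works by masking values that vary between otherwise identical log lines. A rough sketch of that normalization (the regex patterns below are illustrative; OpenflowSight's actual rules may differ):

```python
import re

# Illustrative patterns for values that change between otherwise identical lines.
_UUID = re.compile(r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}")
_TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?")
_NUMBER = re.compile(r"\b\d+\b")

def normalize(line: str) -> str:
    """Replace dynamic values so similar log lines collapse into one pattern."""
    line = _UUID.sub("<uuid>", line)        # mask UUIDs before bare numbers
    line = _TIMESTAMP.sub("<ts>", line)
    line = _NUMBER.sub("<n>", line)
    return line
```

Two lines that differ only in their IDs, timestamps, and counts normalize to the same pattern string, which is what makes grouping by exact match on the normalized form work.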


Open Source

OpenflowSight is open-source and contributions are welcome—features, fixes, docs, anything. If it helps you debug Openflow pipelines faster, that’s a win.

View on GitHub →

Understanding RAG @ All Things Open AI 2025

What I learned while creating Bookshelf (open source RAG Application)

I had the opportunity to speak at the AllThingsOpen.ai 2025 Conference recently. It was a wonderful experience to learn from and share with so many amazing fellow presenters and attendees. All Things Open AI 2025 session recordings are available now.

I spoke about several key aspects of Retrieval-Augmented Generation (RAG) and vector databases. The main topics included the core concepts of RAG, embedding models vs. inference models, handling multiple embeddings, and various optimizations beyond naive RAG such as chunking strategies, metadata enrichment, the auto-merging retriever, and reranking models. Finally, I touched on local execution and GPU utilization.

You can watch the recording here: YouTube – How I created Bookshelf: An Open Source AI-Powered Personal Knowledge Base – Ash Tewari

The slides can be downloaded from here: (Google Drive Link)


SPCHR: Speech To Text App for Windows

I’m excited to announce the release of SPCHR. Pronounced as “speaker”. I know, I know. My apologies, it was too late at night and that’s the best I could come up with 😉

  • It is a Windows desktop application for speech-to-text transcription.
  • You can use it to “add” voice input to applications that don’t natively support it. It works anywhere you can type or paste text.
  • You can use it entirely locally on your PC if you like, so it is private and secure.

Key Features

  • Flexible: SPCHR can use either Azure Speech Services or a local OpenAI Whisper model, giving you the flexibility to choose between cloud-based and local transcription.
  • Zero Configuration: The application works immediately with local Whisper processing – no account setup required!
  • Global Hotkey: Start and stop recording from any application with Ctrl+Alt+L.
  • Seamless Integration: Transcribed text is automatically pasted into your active window.

Getting Started

  • Clone the repository
  • Build the solution using Visual Studio and run it
  • Press Ctrl+Alt+L and start speaking
  • Watch as your words appear in your active window
  • For those wanting cloud-based transcription, simply add your Azure Speech Services credentials to the configuration file.

Open Source

SPCHR is released under the MIT License, and I welcome contributions from the community. Whether you’re interested in adding features, fixing bugs, or improving documentation, your help is appreciated.

Check out GitHub repository

What’s next

This is just the beginning for SPCHR. Several improvements are on the roadmap:

  • Additional language support
  • Customizable hotkeys
  • Installer
  • UI Enhancements
  • AI Enhanced Features

Try it out

Ready to transform how you interact with your computer? Visit the GitHub repository to get started with SPCHR today.

I’m excited to see how you’ll use SPCHR in your workflow and look forward to your feedback!

WhichBox – AI Assistant App

Do you remember which one of those boxes in the garage has your old phone? Or that toy your child wants to play with again? Or the tripod you need for your weekend trip? Oh what about that massager that you could really use right now! You would probably have to dig through those boxes to find it. Same problem when moving. There are always some boxes that you don’t want to open right away, but you wish you could open just the right one when you need something.

I have created WhichBox, your AI assistant for finding that box. It uses the latest vision AI models to help you find things quickly. Here’s what you do:

  1. Take pictures of the content inside each labelled box to create an inventory.
  2. Use WhichBox to easily identify the box containing the item you are looking for.

You can check it out here – https://whichbox.streamlit.app/

The demo has four photos of labelled boxes with some content in them. Note that there can be multiple photos of the same box. You can take photos as you are filling up the box to capture things at the bottom.

You can ask for a specific thing, like “Find Fitbit” or just “fitbit”

You can go for a general category, like “Camera Equipment”

You can get all boxes containing “USB Adapters”

You can look for “Groot” or if you can’t remember the name of a specific toy then you can look for “all toys of movie characters”

For now, you can bring your own API Key and use your own photos to try this out.

WhichBox : https://whichbox.streamlit.app/

AI Powered Bookshelf

Bookshelf is a Generative AI application built as a rudimentary, but fairly capable, RAG implementation written in Python. It can use an open-source LLM (running locally or in the cloud) or a GPT model via OpenAI’s API.

  • The application is created using streamlit.
  • I used llama-index to orchestrate loading documents into the vector database. Only TokenTextSplitter is currently used; it does not optimize for PDF, HTML, and other formats.
  • ChromaDB is the vector database used to store the embedding vectors and metadata of the document nodes.
  • You can use any open source embeddings model from HuggingFace.
  • Bookshelf will automatically use the GPU when creating local embeddings, if the GPU is available on your machine.
  • You can use OpenAI embeddings as well. There is no way to use a specific OpenAI embedding model or configure the parameters yet.
  • Use OpenAI API or any OpenAI compatible LLM API (using LMStudio, Ollama or text-generation-webui) of your choice.
  • There is a live demo on streamlit cloud – https://bookshelf.streamlit.app/
  • The demo allows only OpenAI integration. You can run it locally for accessing Open Source embedding models and LLMs.
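
In spirit, TokenTextSplitter-style chunking is a sliding window over tokens. Below is a whitespace-token approximation, assuming a fixed chunk size and overlap (the real splitter uses a model tokenizer, so this is only a sketch of the idea):

```python
def split_by_tokens(text: str, chunk_size: int = 256, overlap: int = 32) -> list:
    """Approximate token-window chunking with whitespace tokens: emit windows
    of chunk_size tokens, each starting (chunk_size - overlap) tokens after
    the previous one, so adjacent chunks share `overlap` tokens of context."""
    tokens = text.split()
    step = max(chunk_size - overlap, 1)
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        if window:
            chunks.append(" ".join(window))
        if start + chunk_size >= len(tokens):
            break  # last window already reached the end of the text
    return chunks
```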

Live demo – https://bookshelf.streamlit.app/

You will need your OpenAI api key for the demo.

If you are running it locally, you will have the option of using an open-source LLM instance via an API URL. In the screenshot, I am using an open-source embedding model from HuggingFace (sentence-transformers/all-mpnet-base-v2) and the local LLM server at http://localhost:1234/v1.

Collections tab shows all collections in the database. It also shows the names of all the files in the selected collection. You can inspect individual chunks for the metadata and text of each chunk. You can delete all contents of the collection (there is no warning).

You can modify the collection name to create a new collection. Multiple files can be uploaded at the same time. You can specify if you want to extract metadata from the file contents. Enabling this option can add significant cost because it employs Extractors which use LLM to generate title, summaries, keywords and questions for each document.

On the Retrieve tab, you can query chunks which are semantically related to your query.

On the Prompt tab, you can prompt your LLM. The context as well as the Prompt Template is editable.

Here is an example of using the context retrieved from chunks in the Vector database to query the LLM.

This inference was performed using Phi3 model running locally on LMStudio.

Code is on Github – https://github.com/ashtewari/bookshelf

Have fun!