OpenClaw · March 28, 2026 · 8 min read

Building an AI home assistant on top of OpenClaw and Slack

I connected Philips Hue, Sonos, and Eight Sleep to my AI agent system. Now I control my apartment through Slack messages. Here's how it works and where it falls short.


David Bakke

Founder, Bakke & Co

[Cover image: OpenClaw]

Why not just use Siri?

I have an iPhone. I have HomePods. Siri exists. So why did I build a text-based home assistant?

Honestly, it started as a "because I can" project. I had OpenClaw running with 20-something agents for work stuff, and one evening I was lying on the couch too lazy to get up and turn off the kitchen lights. I could have said "Hey Siri, turn off the kitchen lights." But I already had Slack open on my phone. So I thought: what if I just typed it?

That was the beginning of Lumen, my Smarthome agent. What started as a lazy experiment turned into something I use every day. Not because it's better than Siri for simple commands. It's not. But because it fits into how I already work, and it can do things Siri can't.

What it controls

The Smarthome agent connects to three systems in my apartment:

Philips Hue for lighting. I have three spot bulbs in the kitchen and lights throughout the apartment. Each light and room is individually controllable: brightness, color temperature, on/off, scenes.

Sonos for audio. Four speakers: bathroom (10.0.1.132), bedroom (10.0.1.124), kitchen/dining (10.0.1.87), and loft (10.0.1.6). Play, pause, volume, grouping, and Spotify integration.

Eight Sleep for the mattress. Temperature control for the bed, sleep tracking. This one is less about daily control and more about automation: cool the bed before bedtime, warm it up before the alarm goes off.

All three systems have their own apps. All three apps work fine. The value of connecting them to OpenClaw isn't replacing those apps. It's unifying them into one interface that I'm already using all day.

The integration architecture

Each device system connects through its own CLI tool, which the agent accesses via MCP (Model Context Protocol):

  • openhue -- a CLI that talks to the Philips Hue Bridge on my local network
  • sonos -- a CLI for Sonos speaker control
  • A Spotify integration script (spotify-control.js) that uses the full Spotify Web API

The agent also has blu for BluOS-compatible players, though I use that less frequently.

When I type "turn off the kitchen lights" in the #ai-lumen-smarthome Slack channel, here's what happens:

  1. OpenClaw routes the message to the Smarthome agent
  2. The agent parses the intent (turn off, kitchen, lights)
  3. It calls the openhue CLI: something like openhue set lights --room kitchen --state off
  4. The CLI sends the command to the Hue Bridge over the local network
  5. The lights turn off
  6. The agent confirms in Slack: "Kitchen lights off."

Total time: about 3-4 seconds. Faster than getting up. Slower than Siri. But I was already looking at my phone.
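The last hop of that flow, from parsed intent to device command, is thin. Here's a minimal Python sketch: the command shape matches the openhue example quoted above, while the idea that the agent dispatches it via a subprocess call is my assumption about the plumbing.

```python
import subprocess

def build_hue_command(room: str, state: str) -> list[str]:
    """Build an openhue invocation like the one the agent runs."""
    return ["openhue", "set", "lights", "--room", room, "--state", state]

def run_hue_command(room: str, state: str) -> None:
    """Send the command to the Hue Bridge over the local network."""
    subprocess.run(build_hue_command(room, state), check=True)

# "turn off the kitchen lights" reduces to:
cmd = build_hue_command("kitchen", "off")
```

The point is that once the LLM has extracted (room, state), the rest is an ordinary shell invocation; no home-automation-specific logic lives in the agent itself.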

For Sonos, a typical command might be "play some jazz in the kitchen" which the agent translates into a Spotify search for a jazz playlist, then routes it to the kitchen Sonos speaker. The Spotify integration handles the authentication (my account is "wolof85," Premium tier, so full API access) and the Sonos CLI handles the speaker routing.

What's actually useful

I was surprised by what ended up being the most valuable use cases. It wasn't the simple on/off commands. It was the compound ones.

Multi-device routines. "Movie mode" dims the living room lights to 20%, sets them to warm white, pauses any playing music, and turns off the kitchen lights. Doing this manually means opening the Hue app, adjusting two rooms, then opening the Sonos app to pause. In Slack, it's one message.
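A routine like this is just a fixed sequence of device commands. A sketch of how "movie mode" might expand, assuming the openhue flag pattern quoted elsewhere in this post; the color-temp flag and the sonos pause subcommand are illustrative guesses, not confirmed CLI surface:

```python
def movie_mode_commands() -> list[list[str]]:
    """Expand the 'movie mode' routine into individual device commands."""
    return [
        # Dim the living room to 20% warm white (flag names assumed).
        ["openhue", "set", "lights", "--room", "living-room", "--brightness", "20"],
        ["openhue", "set", "lights", "--room", "living-room", "--color-temp", "warm"],
        # Pause whatever is playing (hypothetical sonos subcommand).
        ["sonos", "pause", "--all"],
        # Kitchen lights off, same shape as the earlier openhue example.
        ["openhue", "set", "lights", "--room", "kitchen", "--state", "off"],
    ]
```

Executing the list in order is all the "routine engine" there needs to be; the agent's job is only to map the phrase "movie mode" onto it.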

Remote control from anywhere. This is the one that sold me. I was at choir rehearsal (Oslokoret, Wednesday evenings) and realized I'd left the kitchen lights on. Opened Slack, typed "turn off all lights," done. No need for the Hue app, no need to be on my home WiFi. Because OpenClaw runs on my Mac Studio at home, and Slack works from anywhere, I have home control wherever I have internet.

Before you say "the Hue app can do that remotely too" -- yes, it can. But I'm already in Slack. The context switch to a different app, finding the right room, tapping the right buttons -- it's all friction. In Slack, it's a sentence.

Natural language flexibility. I don't need to remember the exact room names or device IDs. I can say "make the bedroom warmer" and the agent figures out I'm talking about the Eight Sleep mattress, not the Hue lights. I can say "it's too bright in here" and it dims whatever room context suggests (usually based on my last command or the time of day).

Status checks. "Are any lights on?" gives me a quick report. "What's playing?" tells me what's on the Sonos. "What's my bed temperature set to?" checks the Eight Sleep. Handy when I'm in bed and don't want to open three different apps.

The moment it clicked

I was working late one night, deep in a coding session with Forge on the PortLink API. Forge is my engineering agent -- we were debugging a tricky auth flow. I was in the #ai-forge Slack channel going back and forth on the code.

Without switching channels or opening any app, I opened #ai-lumen-smarthome in a new Slack window, typed "dim kitchen to 30, play something chill in the loft," and went back to the coding session. Ten seconds later, the kitchen dimmed and lo-fi beats started playing upstairs.

I didn't break my flow. I didn't leave the tool I was already in. I didn't touch my phone. The home just responded to a text message the same way a colleague responds when you ask them to grab you a coffee.

That's when I realized this wasn't just a novelty. Having everything in one communication platform, work agents and home agents, means you never context-switch. Your home becomes another team member in your Slack workspace.

Where it falls short

Let me be honest about the limitations, because they're real.

Latency. Siri responds in about 1 second for a light command. My system takes 3-4 seconds. That's the round trip from Slack to OpenClaw to the agent to the CLI to the Hue Bridge and back. For most things, 3 seconds is fine. For "I just walked into a dark room and need light now," it's annoying. I still use the physical light switch for those moments.

Complex audio commands. "Play the song that goes 'da da da dum dum'" doesn't work. The agent isn't great at ambiguous music requests. Specific commands work well: "play Daft Punk in the kitchen," "play my Discover Weekly on the loft speaker." Vague ones fail. The Spotify search API returns something, but it's often not what I meant.

Voice would be faster for simple things. "Hey Siri, lights off" is 2 seconds including the wake word. Typing the same in Slack is maybe 8 seconds. For simple binary commands, voice wins. The value of the text-based system is for compound commands, remote access, and situations where voice isn't appropriate (in a meeting, in bed with someone sleeping, at a choir rehearsal).

No presence detection. Siri with HomePod can use your phone's location for presence-based automation. My system doesn't know where I am unless I tell it. "I'm home" could trigger a routine, but I'd have to type it. Automating that would require location API integration, which I haven't prioritized.

The Eight Sleep integration is limited. The Eight Sleep API isn't officially public. The third-party tools that exist for it work, but they don't expose everything the app does. I can set temperature and check sleep data, but some of the more advanced features (like the autopilot temperature adjustment) only work through the official app.

What I'd add next

A scheduled routines system. "At 10pm every night, dim all lights to 20% and set the bed to -2." Right now, I can type this manually each evening or set up a cron job in OpenClaw's scheduler. I'd like a more natural way to define recurring automations through conversation. Tell the agent once, it remembers and executes on schedule.
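Until conversational scheduling exists, a cron-style poller over the same kind of command list gets close. A sketch under loud assumptions: the eight-sleep CLI name and flags are hypothetical, and only the openhue shape comes from earlier in the post.

```python
import datetime

def nightly_commands() -> list[list[str]]:
    """The 10pm routine: dim all lights to 20%, set the bed to -2."""
    return [
        # Flag names assumed; "all" as a room target is illustrative.
        ["openhue", "set", "lights", "--room", "all", "--brightness", "20"],
        # Hypothetical CLI for the Eight Sleep mattress.
        ["eight-sleep", "set-temp", "--level", "-2"],
    ]

def is_due(now: datetime.datetime, hour: int = 22) -> bool:
    """True at the top of the scheduled hour; a cron job would poll this."""
    return now.hour == hour and now.minute == 0
```

A crontab entry firing a script like this each minute would cover the fixed-time case; the missing piece is letting the agent write these entries from a chat message.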

Better Sonos grouping. Right now, playing music across multiple speakers requires specifying each one. I'd like to define groups ("all speakers," "downstairs only") that the agent remembers.
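Group aliases could be as simple as a lookup the agent consults before fanning a command out to speakers. A sketch using the speaker IPs listed earlier in the post; the alias names and their memberships are made up for illustration:

```python
SPEAKERS = {
    "bathroom": "10.0.1.132",
    "bedroom": "10.0.1.124",
    "kitchen": "10.0.1.87",
    "loft": "10.0.1.6",
}

# Hypothetical alias definitions the agent would remember.
GROUPS = {
    "all speakers": ["bathroom", "bedroom", "kitchen", "loft"],
    "downstairs only": ["bathroom", "kitchen"],
}

def resolve_speakers(name: str) -> list[str]:
    """Resolve a room name or group alias to a list of speaker IPs."""
    rooms = GROUPS.get(name, [name] if name in SPEAKERS else [])
    return [SPEAKERS[r] for r in rooms]
```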

Energy monitoring. My smart plugs report energy usage, but I haven't connected them yet. I'd like to ask "how much power did the apartment use today?" and get an answer.

Integration with weather data. "It's getting dark, close the blinds" -- except I don't have smart blinds yet. But combining weather/sunset data with home automation is an obvious next step.

Building your own

The barrier to entry is lower than you'd think if you already have smart home devices.

First, check if your devices have CLI tools or APIs. Philips Hue has several community CLI tools. Sonos has official and unofficial APIs. Most smart home brands have something.

Second, you need a way to route natural language to those tools. OpenClaw handles this with MCP, but you could build something similar with a simple script that parses commands and calls the right CLI. An LLM is good at parsing "dim the kitchen to 40%" into openhue set lights --room kitchen --brightness 40.

Third, you need a communication channel. Slack is what I use because it's already running. But it could be Telegram, Discord, a web interface, anything that lets you send text and receive responses.

The total setup time for my basic lighting control was about two hours. The Sonos integration took another hour. The Eight Sleep integration took longer because of the unofficial API situation.

Is it worth it? For the remote access and compound commands alone, yes. I've stopped thinking of it as a home automation project. It's just another agent in my workspace that happens to control physical things instead of digital ones. And that, honestly, is the most natural way to think about it. The home is just another system the AI manages, alongside email, accounting, and code.

Tags: openclaw, smarthome, home-assistant, hue, sonos