Introduction v1.0.0

Welcome to VoxCMD, an offline, multi-agent voice command system for developers and power users. This documentation will help you get started with VoxCMD and explore its features.

What is VoxCMD?

VoxCMD is a comprehensive voice control system that operates entirely offline, ensuring privacy and security while providing advanced automation capabilities. By leveraging multiple specialized agents, VoxCMD can understand complex commands, execute system operations, assist with development tasks, and enhance productivity.

Privacy-Focused Design

Unlike many voice assistants, VoxCMD runs completely on your local machine without sending your voice data to external servers. This ensures your commands, code, and personal information remain private.

Key Features

  • Offline Operation: All processing happens locally on your machine.
  • Multi-Agent Architecture: Specialized agents handle different aspects of voice command processing.
  • Developer Tools: Code generation, project setup, debugging assistance, and more.
  • Productivity Suite: Calendar management, reminders, focus modes, and other productivity features.
  • System Automation: Launch applications, find files, monitor system resources.
  • Extensible Design: Create custom commands and integrate with other tools.

How VoxCMD Works

VoxCMD uses a multi-agent approach to process and execute voice commands:

  1. Voice Agent: Converts speech to text and identifies command patterns.
  2. Think Agent: Analyzes the command to determine intent and required actions.
  3. LLM Agent: Leverages local language models to understand complex requests and generate responses.
  4. Memory System: Maintains context and remembers preferences for personalized interactions.
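As a rough illustration of how the four stages above could hand work to one another, here is a minimal Python sketch. Every name in it (`voice_agent`, `think_agent`, `llm_agent`, `Memory`, `handle`) is hypothetical and chosen for illustration only; it is not VoxCMD's actual code or API, and the speech recognition and language model steps are replaced with trivial stand-ins that operate on text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

WAKE_WORD = "hey voxcmd"  # default wake word from this documentation, lowercased


@dataclass
class Memory:
    """Hypothetical stand-in for the Memory System: keeps recent commands."""
    history: List[str] = field(default_factory=list)

    def remember(self, command: str) -> None:
        self.history.append(command)


def voice_agent(transcript: str) -> Optional[str]:
    """Stand-in for speech-to-text: return the command after the wake word."""
    text = transcript.strip()
    if not text.lower().startswith(WAKE_WORD):
        return None  # no wake word, so there is nothing to process
    return text[len(WAKE_WORD):].lstrip(" ,").rstrip(" .?!")


def think_agent(command: str) -> Dict[str, str]:
    """Stand-in for intent analysis: treat the first word as the action."""
    action, _, rest = command.partition(" ")
    return {"action": action.lower(), "target": rest}


def llm_agent(intent: Dict[str, str]) -> str:
    """Stand-in for the local language model: format a response."""
    return f"Running '{intent['action']}' on '{intent['target']}'"


def handle(transcript: str, memory: Memory) -> Optional[str]:
    """Pass one utterance through the stages in order."""
    command = voice_agent(transcript)
    if command is None:
        return None
    memory.remember(command)
    return llm_agent(think_agent(command))
```

Calling `handle("Hey VoxCMD, open Chrome", Memory())` walks the utterance through each stage in turn; an utterance without the wake word is ignored.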

Note: VoxCMD requires initial setup and configuration to work optimally with your system and workflow. Follow the installation and quick start guides to get started.

Installation

Follow these steps to install VoxCMD on your system. The installation process varies slightly depending on your operating system.

Prerequisites

Before installing VoxCMD, ensure your system meets the minimum requirements and has the following prerequisites installed:

  • Python 3.8 or higher
  • Node.js 14 or higher
  • Git
  • C++ build tools (for native dependencies)
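If you want to check these prerequisites before installing, a small script along the following lines may help. It is a sketch rather than an official VoxCMD tool: the function names are illustrative, and it assumes the Node.js and Git executables are named `node` and `git` on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 8)  # minimum Python version listed above
MIN_NODE_MAJOR = 14  # minimum Node.js major version listed above


def python_ok(version_info=sys.version_info) -> bool:
    """True if the interpreter meets the minimum Python version."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def node_major(version_output: str) -> int:
    """Extract the major version from `node --version` output such as 'v14.17.0'."""
    return int(version_output.strip().lstrip("v").split(".")[0])


def missing_tools(tools=("node", "git")) -> list:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

For example, `missing_tools()` returns an empty list when both tools are installed, and `node_major("v14.17.0") >= MIN_NODE_MAJOR` confirms the Node.js version once you have captured the output of `node --version`.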

Windows Installation

1. Download the Installer

Download the latest Windows installer from the official website.

2. Run the Installer

Double-click the downloaded .exe file and follow the installation wizard. Allow administrator permissions when prompted.

3. Install Required Models

After installation, VoxCMD will prompt you to download the necessary language and voice recognition models. This may take some time depending on your internet connection.

4. Configuration

Run the initial configuration wizard to set up VoxCMD according to your preferences.

Alternative: Using PowerShell

    # Install using PowerShell
    iwr -useb https://install.voxcmd.dev/win | iex

macOS Installation

1. Download the DMG

Download the latest macOS DMG file from the official website.

2. Install the Application

Open the DMG file and drag VoxCMD to your Applications folder.

3. Grant Permissions

Launch VoxCMD and grant the required permissions for microphone access, accessibility features, and file system access when prompted.

4. Install Required Models

Follow the prompts to download and install the necessary language and voice recognition models.

Alternative: Using Homebrew

    # Install using Homebrew
    brew install --cask voxcmd

Linux Installation

1. Install Dependencies

First, install the required dependencies using your distribution's package manager:

    # For Debian/Ubuntu
    sudo apt update
    sudo apt install python3 python3-pip nodejs npm git build-essential libportaudio2 libasound2-dev

    # For Fedora
    sudo dnf install python3 python3-pip nodejs npm git gcc-c++ portaudio-devel

2. Clone the Repository

    git clone https://github.com/voxcmd/voxcmd.git
    cd voxcmd

3. Run the Installation Script

    ./install.sh

4. Download Models and Configure

Follow the prompts to download models and configure the application.

Tip: For better microphone integration on Linux, consider installing PulseAudio or PipeWire if they're not already on your system.

Quick Start Guide

After installing VoxCMD, follow this quick start guide to get up and running with basic voice commands.

Launching VoxCMD

Start VoxCMD from your applications menu or by running the following command in your terminal:

    # Start VoxCMD
    voxcmd start

Once launched, VoxCMD will run in the background, indicated by an icon in your system tray or menu bar.

Basic Voice Commands

VoxCMD uses "Hey VoxCMD" as the default wake word. Here are some basic commands to get started:

Command | Action
"Hey VoxCMD, what can you do?" | Lists available commands and capabilities
"Hey VoxCMD, open Chrome" | Launches Google Chrome browser
"Hey VoxCMD, create a new folder called Projects" | Creates a new directory named "Projects"
"Hey VoxCMD, what time is it?" | Announces the current time
"Hey VoxCMD, create a new React project" | Initializes a new React application
"Hey VoxCMD, remind me to check emails in 1 hour" | Sets a reminder for 1 hour from now
"Hey VoxCMD, start focus mode for 25 minutes" | Activates focus mode for a Pomodoro session

Command Structure

Most VoxCMD commands follow this general structure:

  • Wake Word: "Hey VoxCMD"
  • Action: "open", "create", "find", "remind", etc.
  • Object: "Chrome", "a new folder", "my calendar", etc.
  • Parameters (Optional): "called Projects", "for 30 minutes", "with TypeScript", etc.

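To make the command structure concrete, the following Python sketch splits an utterance into its action, object, and optional parameters. It is an illustration under stated assumptions, not VoxCMD's parser: the `parse_command` function, the `ParsedCommand` type, and the list of words assumed to introduce parameters (`PARAM_MARKERS`) are all hypothetical.

```python
import re
from typing import NamedTuple, Optional

WAKE_WORD = "Hey VoxCMD"
# Words assumed to introduce the optional parameters part (illustrative only).
PARAM_MARKERS = ("called", "for", "with", "in")


class ParsedCommand(NamedTuple):
    action: str
    obj: str
    parameters: str


def parse_command(utterance: str) -> Optional[ParsedCommand]:
    """Split an utterance into action, object, and optional parameters."""
    # Wake word, separator, one-word action, then the rest minus end punctuation.
    pattern = rf"^{re.escape(WAKE_WORD)}[,\s]+(\w+)\s*(.*?)[.?!]*$"
    match = re.match(pattern, utterance.strip(), flags=re.IGNORECASE)
    if match is None:
        return None  # the utterance did not start with the wake word
    action, rest = match.group(1).lower(), match.group(2)
    marker = re.search(rf"\b({'|'.join(PARAM_MARKERS)})\b", rest)
    if marker is None:
        return ParsedCommand(action, rest, "")
    return ParsedCommand(action, rest[:marker.start()].strip(), rest[marker.start():])
```

For "Hey VoxCMD, create a new folder called Projects", this yields the action `create`, the object `a new folder`, and the parameters `called Projects`. A real parser would need a much richer grammar than a fixed marker list.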
First-Time Setup

When you first launch VoxCMD, the setup wizard will guide you through the following steps:

  1. Voice Training: Train VoxCMD to recognize your voice by reading a few sample phrases.
  2. Environment Selection: Choose which development environments and tools you use most frequently.
  3. Integration Setup: Configure integrations with your calendar, task management systems, and development tools.
  4. Command Customization: Personalize your most frequently used commands.

Note: You can revisit this setup process at any time by saying "Hey VoxCMD, open settings" or using the system tray/menu bar icon.

System Requirements

To ensure optimal performance, your system should meet or exceed the following requirements:

Minimum Requirements

  • Operating System:
    • Windows 10 (64-bit) or later
    • macOS 11 (Big Sur) or later
    • Ubuntu 20.04, Fedora 34, or other modern Linux distributions
  • Processor: Quad-core CPU (Intel i5/AMD Ryzen 5 or equivalent)
  • Memory: 8GB RAM
  • Storage: 5GB free disk space (plus additional space for language models)
  • Microphone: Any built-in or external microphone
  • Internet: Required for initial setup and model downloads only

Recommended Specifications

  • Processor: 6+ core CPU (Intel i7/AMD Ryzen 7 or better)
  • Memory: 16GB RAM or more
  • Storage: 20GB+ free SSD space
  • Microphone: High-quality directional microphone
  • GPU: CUDA-compatible NVIDIA GPU (4GB+ VRAM) for accelerated LLM processing

Additional Software Dependencies

The following software is required or recommended for full functionality:

  • Python: Version 3.8 or higher
  • Node.js: Version 14 or higher
  • Git: Latest version
  • Terminal: Compatible with bash/zsh on macOS/Linux or PowerShell/Command Prompt on Windows
  • Development Tools: Any IDEs or text editors you commonly use (VS Code, IntelliJ, etc.)

Warning: Some features, such as offline language models, may require significantly more system resources. On systems with limited resources, consider using the lightweight configuration, which uses smaller models.

Voice Agent

The Voice Agent is the primary interface between you and VoxCMD. It handles speech recognition, wake word detection, and command parsing.

How It Works

The Voice Agent continuously listens for the wake word ("Hey VoxCMD" by default) and then processes the subsequent command using offline speech recognition models. Here's the process:

  1. Wake Word Detection: Listens for the wake word using a low-power neural network.
  2. Command Recording: Once the wake word is detected, records the following speech until a natural pause.
  3. Speech-to-Text: Converts the recorded audio to text using an offline speech recognition model.
  4. Command Parsing: Identifies the intent and parameters of the command.
  5. Handoff: Passes the parsed command to the Think Agent for further processing.
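Step 2 above, recording "until a natural pause", can be approximated by watching for a run of quiet audio frames. The Python sketch below shows that idea only; the per-frame energy values, the `silence_threshold`, and the `pause_frames` count are assumptions made for illustration, and VoxCMD's actual pause detection may work differently.

```python
from typing import Sequence


def end_of_command(frame_energies: Sequence[float],
                   silence_threshold: float = 0.1,
                   pause_frames: int = 3) -> int:
    """Return how many frames to keep before the trailing pause.

    Recording stops once `pause_frames` consecutive frames fall below
    `silence_threshold`. If no such pause occurs, all frames are kept.
    """
    quiet_run = 0
    for index, energy in enumerate(frame_energies):
        # Extend the quiet run on a silent frame, reset it on a loud one.
        quiet_run = quiet_run + 1 if energy < silence_threshold else 0
        if quiet_run == pause_frames:
            return index + 1 - pause_frames
    return len(frame_energies)
```

A single quiet frame in the middle of a sentence does not end the recording; only a sustained run of quiet frames does, which is why a short hesitation is not mistaken for the end of the command.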

Voice Feedback

The Voice Agent provides audio feedback to confirm command reception and processing status. This includes:

  • Wake word acknowledgment tone
  • Command processing indication
  • Error notifications
  • Verbal responses when appropriate

Customizing the Voice Agent

You can customize various aspects of the Voice Agent through the settings menu or configuration file:

Setting | Description | Default Value
Wake Word | Change the phrase that activates VoxCMD | "Hey VoxCMD"
Activation Sensitivity | Adjust how sensitive the wake word detection is | Medium (0.7)
Voice Feedback Level | Control how much audio feedback is provided | Normal
Voice Gender | Set the preferred gender for VoxCMD's voice responses | Female
Speech Recognition Model | Select which speech recognition model to use | Standard (Medium)
Background Noise Filtering | Enable noise filtering for noisy environments | Enabled
To change these settings using voice commands:

  • "Hey VoxCMD, change wake word to 'Computer'"
  • "Hey VoxCMD, increase voice sensitivity"

Troubleshooting Voice Recognition

If you're experiencing issues with voice recognition:

  1. Recalibrate Microphone: Use "Hey VoxCMD, calibrate microphone" to adjust for your current environment.
  2. Voice Training: Run "Hey VoxCMD, retrain voice model" to improve recognition accuracy.
  3. Check Environment: Reduce background noise and ensure your microphone isn't obstructed.
  4. Adjust Sensitivity: If wake word detection is too sensitive or not sensitive enough, adjust the activation sensitivity.

Tip: For best results, use a dedicated microphone positioned 6-12 inches from your mouth, especially in noisy environments.

Configuration File

VoxCMD uses a configuration file to store settings and preferences. This file can be edited directly for advanced customization.

Location

The configuration file is located at:

Operating System | Location
Windows | %APPDATA%\VoxCMD\config.yaml
macOS | ~/Library/Application Support/VoxCMD/config.yaml
Linux | ~/.config/voxcmd/config.yaml
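If you script against the configuration file, the locations in the table above can be resolved with a few lines of Python. This is a sketch: the `config_path` function is hypothetical and not part of VoxCMD, and it simply encodes the three paths listed above.

```python
import os
import sys
from pathlib import Path
from typing import Mapping


def config_path(platform: str = sys.platform,
                env: Mapping[str, str] = os.environ) -> Path:
    """Return the config.yaml location listed above for the given platform."""
    if platform.startswith("win"):
        # %APPDATA% is set by Windows for each user profile.
        return Path(env["APPDATA"]) / "VoxCMD" / "config.yaml"
    home = Path(env["HOME"]) if "HOME" in env else Path.home()
    if platform == "darwin":  # sys.platform value on macOS
        return home / "Library" / "Application Support" / "VoxCMD" / "config.yaml"
    return home / ".config" / "voxcmd" / "config.yaml"
```

Calling `config_path()` with no arguments uses the current platform and environment; the parameters exist so the function can be exercised for other platforms.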

You can open the configuration file using:

  • "Hey VoxCMD, open configuration file"

File Format

The configuration file uses YAML format. Here's a sample configuration with common settings:

    # VoxCMD Configuration File

    # General Settings
    general:
      language: "en-US"
      auto_start: true
      update_check: true
      telemetry: false

    # Voice Settings
    voice:
      wake_word: "Hey VoxCMD"
      sensitivity: 0.7
      feedback_level: "normal"  # options: minimal, normal, verbose
      voice_gender: "female"  # options: male, female
      recognition_model: "medium"
      noise_filtering: true

    # LLM Settings
    llm:
      model_path: "/path/to/models/llama-7b"
      max_tokens: 2048
      temperature: 0.7
      offload_to_gpu: true

    # Paths and Integrations
    paths:
      projects_dir: "~/Projects"
      downloads_dir: "~/Downloads"
      templates_dir: "~/.voxcmd/templates"

    # Integrations
    integrations:
      vscode:
        enabled: true
        extensions_path: "~/.vscode/extensions"
      browser:
        enabled: true
        preferred: "chrome"
      calendar:
        enabled: true
        provider: "google"
        sync_interval: 15  # minutes

    # Custom Commands
    custom_commands:
      - name: "start dev server"
        action: "terminal"
        command: "npm run dev"
        working_dir: "~/current-project"
      - name: "deploy to production"
        action: "script"
        script_path: "~/.voxcmd/scripts/deploy.sh"
        confirm: true

Warning: Make a backup of your configuration file before making significant changes. Incorrect syntax can prevent VoxCMD from starting properly.
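One simple way to follow that advice is to copy the file to a timestamped sibling before editing it. The Python sketch below does this with the standard library only; the `backup_config` function and the `.bak` naming scheme are illustrative choices, not a VoxCMD feature.

```python
import shutil
import time
from pathlib import Path


def backup_config(config_file: Path) -> Path:
    """Copy config.yaml to a timestamped sibling file and return its path."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = config_file.with_name(f"{config_file.name}.{stamp}.bak")
    shutil.copy2(config_file, backup)  # copy2 also tries to preserve file metadata
    return backup
```

On Linux, for example, `backup_config(Path.home() / ".config" / "voxcmd" / "config.yaml")` leaves the original untouched and creates a copy next to it, which you can rename back if VoxCMD fails to start after an edit.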

Common Configuration Tasks

Changing the Wake Word

To change the phrase that activates VoxCMD:

    voice:
      wake_word: "Computer"  # Change to your preferred wake word

Adding Custom Commands

Custom commands can be added to the custom_commands section:

    custom_commands:
      - name: "build project"
        action: "terminal"
        command: "npm run build"
        working_dir: "~/Projects/current"
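To show how an entry like this might be turned into something runnable, the Python sketch below matches a spoken phrase against the command names and returns the argument list and working directory. How VoxCMD actually dispatches custom commands is not documented here, so the `resolve` function and its exact-match rule are assumptions made for illustration.

```python
import os
import shlex
from typing import Dict, List, Optional, Tuple

# The entry above, written as the Python structure a YAML loader would produce.
CUSTOM_COMMANDS: List[Dict[str, str]] = [
    {"name": "build project", "action": "terminal",
     "command": "npm run build", "working_dir": "~/Projects/current"},
]


def resolve(spoken: str,
            commands: List[Dict[str, str]] = CUSTOM_COMMANDS
            ) -> Optional[Tuple[List[str], str]]:
    """Return (argv, working_dir) for a matching terminal command, if any."""
    wanted = spoken.strip().lower()
    for entry in commands:
        if entry["name"].lower() == wanted and entry["action"] == "terminal":
            argv = shlex.split(entry["command"])  # split like a POSIX shell would
            return argv, os.path.expanduser(entry["working_dir"])
    return None
```

The returned pair could then be passed to `subprocess.run(argv, cwd=working_dir)`. Splitting the command into an argument list, rather than handing the raw string to a shell, avoids unintended shell expansion.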

Configuring Project Directories

Set up your commonly used directories:

    paths:
      projects_dir: "~/Documents/Projects"  # Your main projects folder
      templates_dir: "~/Templates"  # Templates for new projects

Tip: After modifying the configuration file, restart VoxCMD for changes to take effect, or use "Hey VoxCMD, reload configuration".

Troubleshooting

This section covers common issues and their solutions to help you troubleshoot problems with VoxCMD.

Common Issues

VoxCMD Doesn't Respond to Voice Commands

Possible causes:

  • Microphone access permissions not granted
  • Incorrect microphone selected
  • Background noise interfering with wake word detection
  • Voice service not running

Solutions:

  1. Check microphone permissions in system settings
  2. Open VoxCMD settings and select the correct microphone
  3. Run the microphone calibration: "Hey VoxCMD, calibrate microphone"
  4. Restart the voice service: Right-click the system tray icon and select "Restart Voice Service"

VoxCMD Crashes on Startup

Possible causes:

  • Corrupted configuration file
  • Missing or incompatible dependencies
  • Insufficient system resources
  • Conflicting software

Solutions:

  1. Restore the default configuration: Run voxcmd --reset-config in terminal
  2. Check logs for specific errors: voxcmd --debug
  3. Reinstall VoxCMD with the latest version
  4. Check system resources and close resource-intensive applications

High CPU/Memory Usage

Possible causes:

  • Large language model loaded into memory
  • Multiple agents running simultaneously
  • Debug mode enabled

Solutions:

  1. Switch to a smaller language model in settings
  2. Disable unused agents in the configuration file
  3. Enable model offloading if you have a compatible GPU
  4. Use "Hey VoxCMD, enable low resource mode" for temporary relief

Diagnostic Tools

VoxCMD includes several diagnostic tools to help identify and resolve issues:

Tool | Description | How to Access
System Check | Verifies all components are working correctly | "Hey VoxCMD, run system check" or voxcmd --check
Voice Test | Tests microphone and speech recognition | "Hey VoxCMD, test voice recognition" or voxcmd --test-voice
Log Viewer | Displays detailed logs for troubleshooting | "Hey VoxCMD, show logs" or voxcmd --logs
Hardware Monitor | Shows resource usage and hardware compatibility | "Hey VoxCMD, show hardware status" or voxcmd --hardware
Network Diagnostics | Checks connectivity for integrations | "Hey VoxCMD, test network" or voxcmd --network

Checking Logs

Log files can provide detailed information about errors and issues. VoxCMD logs are stored in:

Operating System | Log Location
Windows | %APPDATA%\VoxCMD\logs\
macOS | ~/Library/Logs/VoxCMD/
Linux | ~/.local/share/voxcmd/logs/

To enable more detailed logging for troubleshooting:

    # Run VoxCMD with debug logging enabled
    voxcmd --debug

    # Or modify the config.yaml file:
    logging:
      level: "debug"  # options: info, debug, trace
      file_rotation: 7  # days to keep logs

Reporting Issues

If you encounter issues that you can't resolve, you can report them to the VoxCMD team:

  1. GitHub Issues: Submit a detailed bug report on our GitHub repository
  2. Community Forum: Ask for help on the VoxCMD Community Forum
  3. Export Diagnostics: Run voxcmd --export-diagnostics to generate a diagnostic report to include with your issue

Tip: When reporting issues, always include your operating system, VoxCMD version, and the steps to reproduce the problem. Screenshots or screen recordings can be very helpful.

Frequently Asked Questions

General Questions

Is VoxCMD completely offline?

Yes, once installed and configured, VoxCMD operates entirely offline. All voice processing, language understanding, and command execution happen locally on your machine. Internet connectivity is only required for initial setup, model downloads, anonymous usage telemetry (if you leave it enabled), and optional cloud integrations.

Can I use VoxCMD with other programming languages?

Yes! VoxCMD supports a wide range of programming languages and development environments. The core features work with any language, and there are specialized commands for popular languages like JavaScript, Python, Java, C#, Go, and more. You can also extend VoxCMD with custom commands for your specific development stack.

How many system resources does VoxCMD use?

Resource usage depends on your configuration and which features you're using. With default settings, VoxCMD uses approximately:

  • 200-300MB RAM for the base application
  • Additional 1-4GB for language models (depending on model size)
  • 5-15% CPU when processing voice commands
  • Minimal CPU usage when idle (wake word detection mode)

You can adjust resource usage in settings by choosing smaller models or enabling resource-saving options.
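To see what those figures add up to, the short Python sketch below combines the base application's memory with the language model's memory. It uses only the ranges quoted above and assumes 1GB = 1024MB; the function name is illustrative.

```python
MB_PER_GB = 1024  # assumption: binary gigabytes


def total_ram_mb(base_mb: int, model_gb: int) -> int:
    """Base application memory plus language model memory, in MB."""
    return base_mb + model_gb * MB_PER_GB


low = total_ram_mb(200, 1)   # smallest base figure with the smallest model
high = total_ram_mb(300, 4)  # largest base figure with the largest model
print(f"Estimated RAM: {low / MB_PER_GB:.1f}GB to {high / MB_PER_GB:.1f}GB")
```

Under these assumptions the total comes to roughly 1.2GB to 4.3GB, which fits within the 8GB minimum listed under System Requirements.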

Voice Recognition

Does VoxCMD work with accents?

Yes, VoxCMD's voice recognition system is designed to work with a variety of accents. For best results, complete the voice training process during setup, which helps the system adapt to your specific accent and speech patterns. If you experience issues, you can run additional voice training with the command "Hey VoxCMD, improve voice recognition" to further enhance accuracy.

Can I change the wake word?

Yes, you can customize the wake word in settings. The default is "Hey VoxCMD," but you can change it to any phrase up to 5 words long. Common alternatives include "Computer," "Assistant," or "Hey Dev." To change it, use the command "Hey VoxCMD, change wake word to [your preferred phrase]" or modify the configuration file directly.

How does VoxCMD handle noisy environments?

VoxCMD includes noise filtering technology to improve voice recognition in moderately noisy environments. For optimal performance in very noisy settings:

  • Use a directional microphone positioned close to your mouth
  • Enable enhanced noise filtering in settings
  • Increase the wake word sensitivity
  • Consider using the push-to-talk mode (keyboard shortcut) in extremely noisy environments

Development Features

Can VoxCMD integrate with my IDE?

Yes, VoxCMD offers direct integration with popular IDEs and code editors including:

  • Visual Studio Code (via the official VoxCMD extension)
  • JetBrains IDEs (IntelliJ, PyCharm, WebStorm, etc.)
  • Visual Studio
  • Sublime Text
  • Atom

These integrations allow for more powerful code navigation, generation, and editing capabilities directly within your development environment.

How does code generation work?

VoxCMD's code generation is powered by offline language models that understand programming concepts and syntax. When you request code generation, the LLM Agent processes your request, considers the context (current file, project structure, etc.), and generates appropriate code. The generator supports various programming languages and frameworks, and can create everything from simple functions to complete components or modules.

Can VoxCMD help with debugging?

Yes, VoxCMD includes a Debug Helper that can assist with troubleshooting code issues. It can:

  • Analyze error messages and suggest solutions
  • Explain what specific code does
  • Identify potential bugs or performance issues
  • Generate test cases for specific functions
  • Explain complex code sections in simple terms

Use commands like "Hey VoxCMD, explain this error" or "Hey VoxCMD, help debug this function" while your cursor is on the relevant code.

Privacy and Security

Does VoxCMD send my voice data to the cloud?

No. VoxCMD processes all voice data locally on your device. Your voice recordings, commands, and any code or personal information are never sent to external servers. This ensures complete privacy for sensitive development work and personal commands.

Is telemetry enabled by default?

Basic anonymous usage telemetry is enabled by default, but it only collects non-identifying information like feature usage statistics and error reports to help improve the product. No personal information, voice data, or command content is collected. You can disable telemetry completely during installation or any time in settings with "Hey VoxCMD, disable telemetry" or by editing the configuration file.

Can VoxCMD access sensitive information?

VoxCMD only accesses information you explicitly grant it permission to use. By default, it has access to:

  • Microphone for voice commands
  • File system access for project management
  • Application control for launching/managing programs

Additional permissions (calendar access, email, etc.) are opt-in and can be managed in the permissions settings. All data is processed locally and never leaves your device unless you specifically configure cloud integrations.

Community and Support

Join the VoxCMD community to get help, share ideas, and stay updated on the latest developments.

Community Resources

GitHub Repository

Our GitHub repository is the central hub for VoxCMD development. Here you can:

  • Report bugs and request features
  • Contribute to the codebase
  • Access development builds and beta features
  • View the project roadmap

Visit the VoxCMD GitHub Repository

Discord Community

Join our Discord server to connect with other VoxCMD users and the development team. The Discord community offers:

  • Real-time help and troubleshooting
  • Command sharing and tips
  • Development updates and announcements
  • Showcase of cool projects using VoxCMD

Join the VoxCMD Discord Server

Community Forum

Our community forum is a great place for in-depth discussions, tutorials, and knowledge sharing. The forum includes:

  • Detailed tutorials and guides
  • Advanced configuration examples
  • Integration with other tools and services
  • Community plugin and extension sharing

Visit the VoxCMD Community Forum

Getting Help

If you need assistance with VoxCMD, here are the best ways to get help:

  1. In-App Help: Use "Hey VoxCMD, help me with [feature]" for built-in assistance
  2. Documentation: This documentation covers most features and common issues
  3. Discord Community: Ask questions in the #help channel for quick responses
  4. GitHub Issues: Report bugs or request features on our GitHub repository
  5. Email Support: For premium users, contact support@voxcmd.dev

Tip: When asking for help, provide as much detail as possible about your issue, including your operating system, VoxCMD version, and any error messages you've received. This helps the community provide more accurate assistance.

Contributing

VoxCMD is an open-source project, and we welcome contributions from the community. Here's how you can contribute:

  • Code Contributions: Submit pull requests on GitHub for bug fixes or new features
  • Documentation: Help improve or translate the documentation
  • Testing: Participate in beta testing new features and releases
  • Command Sharing: Share your custom commands and configurations with the community
  • Bug Reports: Report bugs and help verify fixes
  • Feature Requests: Suggest new features or improvements

For more information on contributing, see our Contribution Guidelines on GitHub.