Documentation | VoxCMD

Introduction v1.0.0

Welcome to VoxCMD, an offline, multi-agent voice command system for developers and power users. This documentation will help you get started with VoxCMD and explore its features.

What is VoxCMD?

VoxCMD is a voice control system that runs entirely offline, so your voice data stays on your machine. Multiple specialized agents work together to interpret complex commands, execute system operations, and assist with development and productivity tasks.

Privacy-Focused Design

Unlike many voice assistants, VoxCMD runs completely on your local machine without sending your voice data to external servers. This ensures your commands, code, and personal information remain private.

Key Features

Offline Operation: All processing happens locally on your machine.
Multi-Agent Architecture: Specialized agents handle different aspects of voice command processing.
Developer Tools: Code generation, project setup, debugging assistance, and more.
Productivity Suite: Calendar management, reminders, focus modes, and other productivity features.
System Automation: Launch applications, find files, monitor system resources.
Extensible Design: Create custom commands and integrate with other tools.

How VoxCMD Works

VoxCMD uses a multi-agent approach to process and execute voice commands:

Voice Agent: Converts speech to text and identifies command patterns.
Think Agent: Analyzes the command to determine intent and required actions.
LLM Agent: Leverages local language models to understand complex requests and generate responses.
Memory System: Maintains context and remembers preferences for personalized interactions.
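
To make the hand-offs between these components concrete, here is a minimal Python sketch of a staged pipeline. It is an illustration only, not VoxCMD's actual code: the function names (`voice_agent`, `think_agent`, `llm_agent`, `handle`), the `Memory` class, and the rule that the first word is the action are all hypothetical, and a plain text transcript stands in for real audio.

```python
from dataclasses import dataclass, field
from typing import Dict, List

WAKE_WORD = "hey voxcmd"  # default wake word from this documentation


@dataclass
class Memory:
    """Hypothetical stand-in for the Memory System: keeps past commands."""
    history: List[str] = field(default_factory=list)


def voice_agent(transcript: str) -> str:
    """Stand-in for speech-to-text: strip the wake word from a transcript."""
    text = transcript.strip()
    if text.lower().startswith(WAKE_WORD):
        text = text[len(WAKE_WORD):]
    return text.lstrip(" ,").rstrip(" .?!")


def think_agent(command: str) -> Dict[str, str]:
    """Toy intent detection: treat the first word as the action."""
    action, _, rest = command.partition(" ")
    return {"action": action.lower(), "target": rest}


def llm_agent(intent: Dict[str, str]) -> str:
    """Placeholder for a local language model producing a response."""
    return f"Would run '{intent['action']}' on '{intent['target']}'"


def handle(transcript: str, memory: Memory) -> str:
    """Pass one utterance through each stage and record it in memory."""
    command = voice_agent(transcript)
    memory.history.append(command)
    return llm_agent(think_agent(command))


memory = Memory()
print(handle("Hey VoxCMD, open Chrome", memory))
```

Each stage only sees the output of the previous one, which is the property the list above describes: speech becomes text, text becomes an intent, and the intent becomes a response or action.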

Note: VoxCMD requires initial setup and configuration to work optimally with your system and workflow. Follow the installation and quick start guides to get started.

Installation

Follow these steps to install VoxCMD on your system. The installation process varies slightly depending on your operating system.

Prerequisites

Before installing VoxCMD, ensure your system meets the minimum requirements and has the following prerequisites installed:

Python 3.8 or higher
Node.js 14 or higher
Git
C++ build tools (for native dependencies)
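
If you want to verify these prerequisites before installing, a short script along the following lines may help. It is a sketch using only the Python standard library; the function names are made up for this example, and it does not check for C++ build tools because their location varies by platform.

```python
import shutil
import subprocess
import sys
from typing import Dict

MIN_PYTHON = (3, 8)    # from the prerequisites list
MIN_NODE_MAJOR = 14    # from the prerequisites list


def node_major(version_output: str) -> int:
    """Parse the major version from `node --version` output such as 'v14.17.0'."""
    return int(version_output.strip().lstrip("v").split(".")[0])


def node_ok() -> bool:
    """Return True if a node binary is on PATH and is new enough."""
    node = shutil.which("node")
    if node is None:
        return False
    out = subprocess.run([node, "--version"], capture_output=True, text=True).stdout
    try:
        return node_major(out) >= MIN_NODE_MAJOR
    except ValueError:
        return False  # unexpected or empty version output


def check_prerequisites() -> Dict[str, bool]:
    """Report which prerequisites appear to be available on this machine."""
    return {
        "python>=3.8": sys.version_info >= MIN_PYTHON,
        "node>=14": node_ok(),
        "git": shutil.which("git") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```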

Windows Installation

1. Download the Installer

Download the latest Windows installer from the official website.

2. Run the Installer

Double-click the downloaded .exe file and follow the installation wizard. Allow administrator permissions when prompted.

3. Install Required Models

After installation, VoxCMD will prompt you to download the necessary language and voice recognition models. This may take some time depending on your internet connection.

4. Configuration

Run the initial configuration wizard to set up VoxCMD according to your preferences.

Alternative: Using PowerShell

    # Install using PowerShell
    iwr -useb https://install.voxcmd.dev/win | iex

macOS Installation

1. Download the DMG

Download the latest macOS DMG file from the official website.

2. Install the Application

Open the DMG file and drag VoxCMD to your Applications folder.

3. Grant Permissions

Launch VoxCMD and grant the required permissions for microphone access, accessibility features, and file system access when prompted.

4. Install Required Models

Follow the prompts to download and install the necessary language and voice recognition models.

Alternative: Using Homebrew

    # Install using Homebrew
    brew install --cask voxcmd

Linux Installation

1. Install Dependencies

First, install the required dependencies using your distribution's package manager:

    # For Debian/Ubuntu
    sudo apt update
    sudo apt install python3 python3-pip nodejs npm git build-essential libportaudio2 libasound2-dev

    # For Fedora
    sudo dnf install python3 python3-pip nodejs npm git gcc-c++ portaudio-devel

2. Clone the Repository

    git clone https://github.com/voxcmd/voxcmd.git
    cd voxcmd

3. Run the Installation Script

    ./install.sh

4. Download Models and Configure

Follow the prompts to download models and configure the application.

Tip: For better microphone integration on Linux, consider installing PulseAudio or PipeWire if they're not already on your system.

Quick Start Guide

After installing VoxCMD, follow this quick start guide to get up and running with basic voice commands.

Launching VoxCMD

Start VoxCMD from your applications menu or by running the following command in your terminal:

    # Start VoxCMD
    voxcmd start

Once launched, VoxCMD will run in the background, indicated by an icon in your system tray or menu bar.

Basic Voice Commands

VoxCMD uses "Hey VoxCMD" as the default wake word. Here are some basic commands to get started:

Command	Action
"Hey VoxCMD, what can you do?"	Lists available commands and capabilities
"Hey VoxCMD, open Chrome"	Launches Google Chrome browser
"Hey VoxCMD, create a new folder called Projects"	Creates a new directory named "Projects"
"Hey VoxCMD, what time is it?"	Announces the current time
"Hey VoxCMD, create a new React project"	Initializes a new React application
"Hey VoxCMD, remind me to check emails in 1 hour"	Sets a reminder for 1 hour from now
"Hey VoxCMD, start focus mode for 25 minutes"	Activates focus mode for a Pomodoro session

Command Structure

Most VoxCMD commands follow this general structure:

Wake Word: "Hey VoxCMD"
Action: "open", "create", "find", "remind", etc.
Object: "Chrome", "a new folder", "my calendar", etc.
Parameters (Optional): "called Projects", "for 30 minutes", "with TypeScript", etc.

First-Time Setup

When you first launch VoxCMD, the setup wizard will guide you through the following steps:

Voice Training: Train VoxCMD to recognize your voice by reading a few sample phrases.
Environment Selection: Choose which development environments and tools you use most frequently.
Integration Setup: Configure integrations with your calendar, task management systems, and development tools.
Command Customization: Personalize your most frequently used commands.

Note: You can revisit this setup process at any time by saying "Hey VoxCMD, open settings" or using the system tray/menu bar icon.
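
The command structure described in this guide (wake word, action, object, optional parameters) can be illustrated with a small parser. This is a rough sketch, not VoxCMD's real parser: the `Command` type, the `parse` function, and the assumption that parameters begin at the words "called", "for", or "with" (taken from the examples in this guide) are all hypothetical.

```python
import re
from typing import NamedTuple, Optional

WAKE_WORD = "hey voxcmd"
# Words that introduce optional parameters in this guide's examples.
PARAM_MARKERS = ("called", "for", "with")


class Command(NamedTuple):
    action: str
    obj: str
    parameters: Optional[str]


def parse(utterance: str) -> Optional[Command]:
    """Split an utterance into action, object, and optional parameters.

    Returns None when the utterance does not start with the wake word
    or contains nothing after it.
    """
    text = utterance.strip().rstrip(".?!")
    if not text.lower().startswith(WAKE_WORD):
        return None
    body = text[len(WAKE_WORD):].lstrip(" ,")
    if not body:
        return None
    action, _, rest = body.partition(" ")
    # Look for the first parameter marker as a whole word.
    pattern = r"\b(" + "|".join(PARAM_MARKERS) + r")\b"
    match = re.search(pattern, rest, flags=re.IGNORECASE)
    if match:
        obj = rest[:match.start()].strip()
        params: Optional[str] = rest[match.start():].strip()
    else:
        obj, params = rest.strip(), None
    return Command(action.lower(), obj, params)


print(parse("Hey VoxCMD, create a new folder called Projects"))
print(parse("Hey VoxCMD, open Chrome"))
```

A keyword split like this handles the short examples in the table but not free-form requests such as reminders; those are the cases the Think Agent and LLM Agent are described as covering.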

System Requirements

To ensure optimal performance, your system should meet or exceed the following requirements:

Minimum Requirements

Operating System:
- Windows 10 (64-bit) or later
- macOS 11 (Big Sur) or later
- Ubuntu 20.04, Fedora 34, or other modern Linux distributions
Processor: Quad-core CPU (Intel i5/AMD Ryzen 5 or equivalent)
Memory: 8GB RAM
Storage: 5GB free disk space (plus additional space for language models)
Microphone: Any built-in or external microphone
Internet: Required for initial setup and model downloads only
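
Two of these minimums can be checked with the Python standard library, as in the sketch below. The function name is invented for this example. Note the limits: `os.cpu_count()` reports logical cores rather than physical ones, and the standard library has no portable way to read installed RAM, so memory is not checked here.

```python
import os
import shutil
from typing import Dict

MIN_CORES = 4          # "Quad-core CPU"
MIN_FREE_DISK_GB = 5   # "5GB free disk space"


def check_hardware(path: str = os.path.expanduser("~")) -> Dict[str, bool]:
    """Compare core count and free disk space at `path` with the minimums."""
    cores = os.cpu_count() or 0  # cpu_count() may return None
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "cpu_cores": cores >= MIN_CORES,
        "free_disk": free_gb >= MIN_FREE_DISK_GB,
    }


print(check_hardware())
```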

Recommended Specifications

Processor: 6+ core CPU (Intel i7/AMD Ryzen 7 or better)
Memory: 16GB RAM or more
Storage: 20GB+ free SSD space
Microphone: High-quality directional microphone
GPU: CUDA-compatible NVIDIA GPU (4GB+ VRAM) for accelerated LLM processing

Additional Software Dependencies

The following software is required or recommended for full functionality:

Python: Version 3.8 or higher
Node.js: Version 14 or higher
Git: Latest version
Terminal: Compatible with bash/zsh on macOS/Linux or PowerShell/Command Prompt on Windows
Development Tools: Any IDEs or text editors you commonly use (VS Code, IntelliJ, etc.)

Warning: Some features, such as offline language models, may require significantly more system resources. On systems with limited resources, consider the lightweight configuration, which uses smaller models.

Voice Agent

The Voice Agent is the primary interface between you and VoxCMD. It handles speech recognition, wake word detection, and command parsing.

How It Works

The Voice Agent continuously listens for the wake word ("Hey VoxCMD" by default) and then processes the subsequent command using offline speech recognition models. Here's the process:

Wake Word Detection: Listens for the wake word using a low-power neural network.
Command Recording: Once the wake word is detected, records the following speech until a natural pause.
Speech-to-Text: Converts the recorded audio to text using an offline speech recognition model.
Command Parsing: Identifies the intent and parameters of the command.
Handoff: Passes the parsed command to the Think Agent for further processing.
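
Recording "until a natural pause" is commonly done with silence detection. The sketch below shows one simple version of that idea and is not necessarily how VoxCMD implements it: it assumes the audio has already been split into fixed-size frames with one energy value per frame, and the threshold and pause length are hypothetical defaults.

```python
from typing import Iterable, List


def record_until_pause(
    frame_energies: Iterable[float],
    silence_threshold: float = 0.1,   # hypothetical energy cutoff
    max_silent_frames: int = 3,       # hypothetical pause length, in frames
) -> List[float]:
    """Collect frames until `max_silent_frames` consecutive quiet frames occur.

    The trailing silent frames are dropped from the returned recording.
    """
    recorded: List[float] = []
    silent_run = 0
    for energy in frame_energies:
        recorded.append(energy)
        if energy < silence_threshold:
            silent_run += 1
            if silent_run >= max_silent_frames:
                return recorded[:-silent_run]
        else:
            silent_run = 0
    # Stream ended without a long enough pause: keep everything.
    return recorded


frames = [0.8, 0.6, 0.05, 0.7, 0.02, 0.01, 0.03, 0.9]
print(record_until_pause(frames))  # [0.8, 0.6, 0.05, 0.7]
```

The single quiet frame (0.05) does not end the recording because speech resumes right after it; only the run of three quiet frames counts as a pause.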

Voice Feedback

The Voice Agent provides audio feedback to confirm command reception and processing status. This includes:

Wake word acknowledgment tone
Command processing indication
Error notifications
Verbal responses when appropriate

Customizing the Voice Agent

You can customize various aspects of the Voice Agent through the settings menu or configuration file:

Setting	Description	Default Value
Wake Word	Change the phrase that activates VoxCMD	"Hey VoxCMD"
Activation Sensitivity	Adjust how sensitive the wake word detection is	Medium (0.7)
Voice Feedback Level	Control how much audio feedback is provided	Normal
Voice Gender	Set the preferred gender for VoxCMD's voice responses	Female
Speech Recognition Model	Select which speech recognition model to use	Standard (Medium)
Background Noise Filtering	Enable noise filtering for noisy environments	Enabled

To change these settings using voice commands:

Voice Command: "Hey VoxCMD, change wake word to 'Computer'"
Voice Command: "Hey VoxCMD, increase voice sensitivity"

Troubleshooting Voice Recognition

If you're experiencing issues with voice recognition:

Recalibrate Microphone: Use "Hey VoxCMD, calibrate microphone" to adjust for your current environment.
Voice Training: Run "Hey VoxCMD, retrain voice model" to improve recognition accuracy.
Check Environment: Reduce background noise and ensure your microphone isn't obstructed.
Adjust Sensitivity: If wake word detection is too sensitive or not sensitive enough, adjust the activation sensitivity.

Tip: For best results, use a dedicated microphone positioned 6-12 inches (15-30 cm) from your mouth, especially in noisy environments.

Configuration File

VoxCMD uses a configuration file to store settings and preferences. This file can be edited directly for advanced customization.

Location

The configuration file is located at:

Operating System	Location
Windows	`%APPDATA%\VoxCMD\config.yaml`
macOS	`~/Library/Application Support/VoxCMD/config.yaml`
Linux	`~/.config/voxcmd/config.yaml`
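
For scripts that need to find the file, the table can be expressed as a small helper. This is a sketch: the function name `config_path` is made up for this example, and the fallback used when `APPDATA` is unset is an assumption.

```python
import os
import platform
from pathlib import Path


def config_path(system: str = "") -> Path:
    """Return the config.yaml location listed in the table for `system`."""
    system = system or platform.system()  # "Windows", "Darwin", or "Linux"
    if system == "Windows":
        # Assumption: fall back to the usual roaming profile folder
        # if the APPDATA environment variable is not set.
        appdata = os.environ.get("APPDATA", str(Path.home() / "AppData" / "Roaming"))
        return Path(appdata) / "VoxCMD" / "config.yaml"
    if system == "Darwin":
        return Path.home() / "Library" / "Application Support" / "VoxCMD" / "config.yaml"
    return Path.home() / ".config" / "voxcmd" / "config.yaml"


print(config_path())
```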

You can open the configuration file using:

Voice Command: "Hey VoxCMD, open configuration file"

File Format

The configuration file uses YAML format. Here's a sample configuration with common settings:

    # VoxCMD Configuration File

    # General Settings
    general:
      language: "en-US"
      auto_start: true
      update_check: true
      telemetry: false  # false disables anonymous usage telemetry (see the FAQ)

    # Voice Settings
    voice:
      wake_word: "Hey VoxCMD"
      sensitivity: 0.7
      feedback_level: "normal"  # options: minimal, normal, verbose
      voice_gender: "female"    # options: male, female
      recognition_model: "medium"
      noise_filtering: true

    # LLM Settings
    llm:
      model_path: "/path/to/models/llama-7b"
      max_tokens: 2048
      temperature: 0.7
      offload_to_gpu: true

    # Paths
    paths:
      projects_dir: "~/Projects"
      downloads_dir: "~/Downloads"
      templates_dir: "~/.voxcmd/templates"

    # Integrations
    integrations:
      vscode:
        enabled: true
        extensions_path: "~/.vscode/extensions"
      browser:
        enabled: true
        preferred: "chrome"
      calendar:
        enabled: true
        provider: "google"
        sync_interval: 15  # minutes

    # Custom Commands
    custom_commands:
      - name: "start dev server"
        action: "terminal"
        command: "npm run dev"
        working_dir: "~/current-project"
      - name: "deploy to production"
        action: "script"
        script_path: "~/.voxcmd/scripts/deploy.sh"
        confirm: true

Warning: Make a backup of your configuration file before making significant changes. Incorrect syntax can prevent VoxCMD from starting properly.
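
One low-effort way to follow this advice is to keep a timestamped copy next to the original. The helper below is a sketch using the Python standard library; the function name and the `.bak` naming scheme are choices made for this example, not a VoxCMD feature.

```python
import shutil
import time
from pathlib import Path


def backup_config(config_file: Path) -> Path:
    """Copy the config file next to itself with a timestamp suffix."""
    stamp = time.strftime("%Y%m%d-%H%M%S")
    backup = config_file.with_name(f"{config_file.name}.{stamp}.bak")
    shutil.copy2(config_file, backup)  # copy2 also preserves file metadata
    return backup


# Example (Linux location from the table in the Location section):
# backup_config(Path.home() / ".config" / "voxcmd" / "config.yaml")
```

To roll back, copy the `.bak` file over `config.yaml` and restart VoxCMD.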

Common Configuration Tasks

Changing the Wake Word

To change the phrase that activates VoxCMD:

    voice:
      wake_word: "Computer"  # Change to your preferred wake word

Adding Custom Commands

Custom commands can be added to the custom_commands section:

    custom_commands:
      - name: "build project"
        action: "terminal"
        command: "npm run build"
        working_dir: "~/Projects/current"
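
To show how an entry like this might be interpreted, the sketch below turns a `custom_commands` entry into an argument list and a working directory without executing anything. It is an assumption about the semantics, not VoxCMD's implementation: the function name is hypothetical, only the two `action` values that appear in this documentation ("terminal" and "script") are handled, and defaulting `working_dir` to the home directory is a guess.

```python
import os
import shlex
from typing import Dict, List, Tuple


def plan_custom_command(entry: Dict[str, object]) -> Tuple[List[str], str]:
    """Turn a custom_commands entry into (argv, working directory).

    Nothing is executed here; the result could be passed to subprocess.run.
    """
    # Assumption: commands without working_dir run from the home directory.
    cwd = os.path.expanduser(str(entry.get("working_dir", "~")))
    action = entry["action"]
    if action == "terminal":
        argv = shlex.split(str(entry["command"]))
    elif action == "script":
        argv = [os.path.expanduser(str(entry["script_path"]))]
    else:
        raise ValueError(f"unsupported action: {action!r}")
    return argv, cwd


entry = {
    "name": "build project",
    "action": "terminal",
    "command": "npm run build",
    "working_dir": "~/Projects/current",
}
argv, cwd = plan_custom_command(entry)
print(argv, cwd)
```

Separating planning from execution makes it easy to honor a flag such as `confirm: true`, since the caller can show the planned command and wait for approval before running it.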
                        

Configuring Project Directories

Set up your commonly used directories:

    paths:
      projects_dir: "~/Documents/Projects"  # Your main projects folder
      templates_dir: "~/Templates"          # Templates for new projects

Tip: After modifying the configuration file, restart VoxCMD for changes to take effect, or use "Hey VoxCMD, reload configuration".
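
Because a malformed file can keep VoxCMD from starting, it may help to sanity-check the `voice` section before restarting or reloading. The sketch below works on a plain dictionary, such as the one a YAML parser like PyYAML's `yaml.safe_load` returns. The allowed values for `feedback_level` and `voice_gender` come from the sample configuration; the 0 to 1 range for `sensitivity` and the function name are assumptions.

```python
from typing import Any, Dict, List

FEEDBACK_LEVELS = ("minimal", "normal", "verbose")  # from the sample config
VOICE_GENDERS = ("male", "female")                  # from the sample config


def validate_voice(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in the `voice` section (empty if none)."""
    problems: List[str] = []
    voice = config.get("voice") or {}

    sensitivity = voice.get("sensitivity", 0.7)
    # Assumption: sensitivity is a number between 0 and 1.
    if not isinstance(sensitivity, (int, float)) or not 0 <= sensitivity <= 1:
        problems.append("voice.sensitivity should be a number between 0 and 1")

    if voice.get("feedback_level", "normal") not in FEEDBACK_LEVELS:
        problems.append("voice.feedback_level should be minimal, normal, or verbose")

    if voice.get("voice_gender", "female") not in VOICE_GENDERS:
        problems.append("voice.voice_gender should be male or female")

    wake_word = voice.get("wake_word", "Hey VoxCMD")
    if not isinstance(wake_word, str) or not wake_word.strip():
        problems.append("voice.wake_word should be a non-empty string")

    return problems


config = {"voice": {"wake_word": "Computer", "sensitivity": 1.5, "feedback_level": "loud"}}
for problem in validate_voice(config):
    print(problem)
```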

Troubleshooting

This section covers common issues and their solutions to help you troubleshoot problems with VoxCMD.

Common Issues

VoxCMD Doesn't Respond to Voice Commands

Possible causes:

Microphone access permissions not granted
Incorrect microphone selected
Background noise interfering with wake word detection
Voice service not running

Solutions:

Check microphone permissions in system settings
Open VoxCMD settings and select the correct microphone
Run the microphone calibration: "Hey VoxCMD, calibrate microphone"
Restart the voice service: Right-click the system tray icon and select "Restart Voice Service"

VoxCMD Crashes on Startup

Possible causes:

Corrupted configuration file
Missing or incompatible dependencies
Insufficient system resources
Conflicting software

Solutions:

Restore the default configuration: Run `voxcmd --reset-config` in a terminal
Start VoxCMD with debug logging to surface specific errors: `voxcmd --debug`
Reinstall VoxCMD with the latest version
Check system resources and close resource-intensive applications

High CPU/Memory Usage

Possible causes:

Large language model loaded into memory
Multiple agents running simultaneously
Debug mode enabled

Solutions:

Switch to a smaller language model in settings
Disable unused agents in the configuration file
Enable model offloading if you have a compatible GPU
Use "Hey VoxCMD, enable low resource mode" for temporary relief

Diagnostic Tools

VoxCMD includes several diagnostic tools to help identify and resolve issues:

Tool	Description	How to Access
System Check	Verifies all components are working correctly	"Hey VoxCMD, run system check" or `voxcmd --check`
Voice Test	Tests microphone and speech recognition	"Hey VoxCMD, test voice recognition" or `voxcmd --test-voice`
Log Viewer	Displays detailed logs for troubleshooting	"Hey VoxCMD, show logs" or `voxcmd --logs`
Hardware Monitor	Shows resource usage and hardware compatibility	"Hey VoxCMD, show hardware status" or `voxcmd --hardware`
Network Diagnostics	Checks connectivity for integrations	"Hey VoxCMD, test network" or `voxcmd --network`

Checking Logs

Log files can provide detailed information about errors and issues. VoxCMD logs are stored in:

Operating System	Log Location
Windows	`%APPDATA%\VoxCMD\logs\`
macOS	`~/Library/Logs/VoxCMD/`
Linux	`~/.local/share/voxcmd/logs/`
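
When a log directory holds many files, a short script can pick the newest one and pull out lines that mention errors. This is a sketch under assumptions: the documentation does not describe the log file names or line format, so the code simply takes the most recently modified file and does a case-insensitive search for "error". Both function names are invented for this example.

```python
from pathlib import Path
from typing import List, Optional


def newest_log(log_dir: Path) -> Optional[Path]:
    """Return the most recently modified file in log_dir, or None."""
    if not log_dir.is_dir():
        return None
    files = [p for p in log_dir.iterdir() if p.is_file()]
    return max(files, key=lambda p: p.stat().st_mtime, default=None)


def error_lines(log_file: Path) -> List[str]:
    """Return lines that mention 'error' (case-insensitive)."""
    text = log_file.read_text(encoding="utf-8", errors="replace")
    return [line for line in text.splitlines() if "error" in line.lower()]


# Linux location from the table; adjust for Windows or macOS.
log_dir = Path.home() / ".local" / "share" / "voxcmd" / "logs"
latest = newest_log(log_dir)
if latest is not None:
    for line in error_lines(latest):
        print(line)
```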

To enable more detailed logging for troubleshooting:

    # Run VoxCMD with debug logging enabled
    voxcmd --debug

Or modify the config.yaml file:

    logging:
      level: "debug"  # options: info, debug, trace
      file_rotation: 7  # days to keep logs

Reporting Issues

If you encounter issues that you can't resolve, you can report them to the VoxCMD team:

GitHub Issues: Submit a detailed bug report on our GitHub repository
Community Forum: Ask for help on the VoxCMD Community Forum
Export Diagnostics: Run `voxcmd --export-diagnostics` to generate a diagnostic report to include with your issue

Tip: When reporting issues, always include your operating system, VoxCMD version, and the steps to reproduce the problem. Screenshots or screen recordings can be very helpful.
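
The operating system details can be gathered with Python's standard `platform` module, as in this small sketch. The function name is made up for this example, and the VoxCMD version still has to be added by hand, since this documentation does not list a command for printing it.

```python
import platform
from typing import Dict


def environment_report() -> Dict[str, str]:
    """Collect basic OS and Python details to paste into a bug report."""
    return {
        "os": platform.system(),
        "os_release": platform.release(),
        "machine": platform.machine(),
        "python": platform.python_version(),
    }


for key, value in environment_report().items():
    print(f"{key}: {value}")
```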

Frequently Asked Questions

General Questions

Is VoxCMD completely offline?

Yes, once installed and configured, VoxCMD operates entirely offline. All voice processing, language understanding, and command execution happen locally on your machine. Internet connectivity is only required for initial setup, model downloads, and optional cloud integrations.

Can I use VoxCMD with other programming languages?

Yes! VoxCMD supports a wide range of programming languages and development environments. The core features work with any language, and there are specialized commands for popular languages like JavaScript, Python, Java, C#, Go, and more. You can also extend VoxCMD with custom commands for your specific development stack.

How many system resources does VoxCMD use?

Resource usage depends on your configuration and which features you're using. With default settings, VoxCMD uses approximately:

200-300MB RAM for the base application
Additional 1-4GB for language models (depending on model size)
5-15% CPU when processing voice commands
Minimal CPU usage when idle (wake word detection mode)

You can adjust resource usage in settings by choosing smaller models or enabling resource-saving options.
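
As a worked example, the RAM figures above can be combined into a rough total. This is back-of-the-envelope arithmetic only, assuming 1 GB = 1024 MB and that the base application and one language model are loaded at the same time.

```python
from typing import Tuple

BASE_MB = (200, 300)   # base application, low and high estimate
MODEL_GB = (1, 4)      # language model, low and high estimate


def estimated_ram_mb() -> Tuple[int, int]:
    """Return the (low, high) combined RAM estimate in megabytes."""
    low = BASE_MB[0] + MODEL_GB[0] * 1024
    high = BASE_MB[1] + MODEL_GB[1] * 1024
    return low, high


low, high = estimated_ram_mb()
print(f"{low} MB to {high} MB")  # 1224 MB to 4396 MB
```

On a machine with the minimum 8GB of RAM (8192 MB), the high estimate is a little over half of total memory, which is why a smaller model is the first thing to try on constrained systems.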

Voice Recognition

Does VoxCMD work with accents?

Yes, VoxCMD's voice recognition system is designed to work with a variety of accents. For best results, complete the voice training process during setup, which helps the system adapt to your specific accent and speech patterns. If you experience issues, you can run additional voice training with the command "Hey VoxCMD, improve voice recognition" to further enhance accuracy.

Can I change the wake word?

Yes, you can customize the wake word in settings. The default is "Hey VoxCMD," but you can change it to another short phrase of up to five words. Common alternatives include "Computer," "Assistant," or "Hey Dev." To change it, use the command "Hey VoxCMD, change wake word to [your preferred phrase]" or modify the configuration file directly.

How does VoxCMD handle noisy environments?

VoxCMD includes noise filtering technology to improve voice recognition in moderately noisy environments. For optimal performance in very noisy settings:

Use a directional microphone positioned close to your mouth
Enable enhanced noise filtering in settings
Increase the wake word sensitivity
Consider using the push-to-talk mode (keyboard shortcut) in extremely noisy environments

Development Features

Can VoxCMD integrate with my IDE?

Yes, VoxCMD offers direct integration with popular IDEs and code editors including:

Visual Studio Code (via the official VoxCMD extension)
JetBrains IDEs (IntelliJ, PyCharm, WebStorm, etc.)
Visual Studio
Sublime Text
Atom

These integrations allow for more powerful code navigation, generation, and editing capabilities directly within your development environment.

How does code generation work?

VoxCMD's code generation is powered by offline language models that understand programming concepts and syntax. When you request code generation, the LLM Agent processes your request, considers the context (current file, project structure, etc.), and generates appropriate code. The generator supports various programming languages and frameworks, and can create everything from simple functions to complete components or modules.

Can VoxCMD help with debugging?

Yes, VoxCMD includes a Debug Helper that can assist with troubleshooting code issues. It can:

Analyze error messages and suggest solutions
Explain what specific code does
Identify potential bugs or performance issues
Generate test cases for specific functions
Explain complex code sections in simple terms

Use commands like "Hey VoxCMD, explain this error" or "Hey VoxCMD, help debug this function" while your cursor is on the relevant code.

Privacy and Security

Does VoxCMD send my voice data to the cloud?

No. VoxCMD processes all voice data locally on your device. Your voice recordings, commands, and any code or personal information are not sent to external servers unless you explicitly configure a cloud integration. This keeps sensitive development work and personal commands private.

Is telemetry enabled by default?

Basic anonymous usage telemetry is enabled by default, but it only collects non-identifying information such as feature usage statistics and error reports to help improve the product. No personal information, voice data, or command content is collected. You can disable telemetry during installation, or at any time afterward by saying "Hey VoxCMD, disable telemetry" or by setting telemetry: false in the general section of the configuration file.

Can VoxCMD access sensitive information?

VoxCMD only accesses information you explicitly grant it permission to use. By default, it has access to:

Microphone for voice commands
File system access for project management
Application control for launching/managing programs

Additional permissions (calendar access, email, etc.) are opt-in and can be managed in the permissions settings. All data is processed locally and never leaves your device unless you specifically configure cloud integrations.

Community and Support

Join the VoxCMD community to get help, share ideas, and stay updated on the latest developments.

Community Resources

GitHub Repository

Our GitHub repository is the central hub for VoxCMD development. Here you can:

Report bugs and request features
Contribute to the codebase
Access development builds and beta features
View the project roadmap

Visit the VoxCMD GitHub Repository

Discord Community

Join our Discord server to connect with other VoxCMD users and the development team. The Discord community offers:

Real-time help and troubleshooting
Command sharing and tips
Development updates and announcements
Showcase of cool projects using VoxCMD

Join the VoxCMD Discord Server

Community Forum

Our community forum is a great place for in-depth discussions, tutorials, and knowledge sharing. The forum includes:

Detailed tutorials and guides
Advanced configuration examples
Integration with other tools and services
Community plugin and extension sharing

Visit the VoxCMD Community Forum

Getting Help

If you need assistance with VoxCMD, here are the best ways to get help:

In-App Help: Use "Hey VoxCMD, help me with [feature]" for built-in assistance
Documentation: This documentation covers most features and common issues
Discord Community: Ask questions in the #help channel for quick responses
GitHub Issues: Report bugs or request features on our GitHub repository
Email Support: For premium users, contact support@voxcmd.dev

Tip: When asking for help, provide as much detail as possible about your issue, including your operating system, VoxCMD version, and any error messages you've received. This helps the community provide more accurate assistance.

Contributing

VoxCMD is an open-source project, and we welcome contributions from the community. Here's how you can contribute:

Code Contributions: Submit pull requests on GitHub for bug fixes or new features
Documentation: Help improve or translate the documentation
Testing: Participate in beta testing new features and releases
Command Sharing: Share your custom commands and configurations with the community
Bug Reports: Report bugs and help verify fixes
Feature Requests: Suggest new features or improvements

For more information on contributing, see our Contribution Guidelines on GitHub.