LSB Audio Steganography
Interactive demonstration of Least Significant Bit (LSB) steganography for embedding and extracting hidden messages in audio signals. Implements the classic LSB algorithm for covert communication in digital audio.
Initialization
- Click 'Initialize Python Runtime' to load Pyodide and NumPy
- Wait for status to show 'Ready' before proceeding
Execution
- Steganography analysis runs automatically after initialization
- Click 'Re-run Steganography' to execute LSB embedding and extraction again
- View real-time updates in the status indicator
Layout
- Python Implementation Section: View complete syntax-highlighted code (5 algorithm steps)
- Analysis Results Section: Interactive output with embedding results, SNR metrics, and LSB distribution charts
import numpy as np
import json
def text_to_binary(message):
    """Convert text message to binary string (8 bits per character)."""
    # Assumes every character has a code point below 256; a wider character
    # would produce more than 8 bits and break decoding.
    binary = ''.join(format(ord(char), '08b') for char in message)
    return binary


def binary_to_text(binary_string):
    """Convert binary string back to text."""
    chars = []
    for i in range(0, len(binary_string), 8):
        byte = binary_string[i:i+8]
        # Ignore a trailing partial byte
        if len(byte) == 8:
            chars.append(chr(int(byte, 2)))
    return ''.join(chars)
def embed_message_lsb(audio_samples, message):
    """
    Embed message into audio using LSB steganography.
    Returns modified audio samples and metadata.
    """
    # Convert message to binary
    message_binary = text_to_binary(message)
    message_length = len(message_binary)
    # Check if audio has enough samples
    if message_length > len(audio_samples):
        raise ValueError(f"Message too long. Need {message_length} samples, have {len(audio_samples)}")
    # Create copy of audio
    stego_audio = audio_samples.copy()
    # Embed message bits into LSB of audio samples
    for i in range(message_length):
        # Get the audio sample as integer
        sample = int(stego_audio[i])
        # Get message bit
        message_bit = int(message_binary[i])
        # Clear LSB and set to message bit
        sample = (sample & ~1) | message_bit
        stego_audio[i] = sample
    return stego_audio, message_length
def extract_message_lsb(stego_audio, message_length):
    """
    Extract hidden message from audio using LSB steganography.
    """
    # Extract LSBs from required number of samples
    binary_message = ''
    for i in range(message_length):
        sample = int(stego_audio[i])
        lsb = sample & 1
        binary_message += str(lsb)
    # Convert binary to text
    message = binary_to_text(binary_message)
    return message
def calculate_embedding_stats(original_audio, stego_audio, message_length):
    """Calculate statistics about the embedding process."""
    # Calculate SNR (Signal-to-Noise Ratio)
    diff = original_audio[:message_length] - stego_audio[:message_length]
    signal_power = np.mean(original_audio[:message_length].astype(float) ** 2)
    noise_power = np.mean(diff.astype(float) ** 2)
    # Handle edge cases for SNR calculation
    if noise_power > 0 and signal_power > 0:
        snr_db = 10 * np.log10(signal_power / noise_power)
        # Cap at reasonable max value
        if np.isinf(snr_db) or snr_db > 200:
            snr_db = 200.0
        elif np.isnan(snr_db):
            snr_db = 0.0
    else:
        snr_db = 200.0  # Perfect embedding (no noise)
    # Calculate capacity and usage
    total_capacity_bits = len(original_audio)
    used_bits = message_length
    usage_percent = (used_bits / total_capacity_bits) * 100
    # Count modified samples
    modified_samples = np.sum(original_audio[:message_length] != stego_audio[:message_length])
    modification_rate = (modified_samples / message_length) * 100 if message_length > 0 else 0.0
    return {
        'snr_db': float(snr_db),
        'capacity_bits': int(total_capacity_bits),
        'used_bits': int(used_bits),
        'usage_percent': float(usage_percent),
        'modified_samples': int(modified_samples),
        'modification_rate': float(modification_rate)
    }
def analyze_bit_distribution(stego_audio, message_length):
    """Analyze the distribution of LSBs in the stego audio."""
    lsbs = []
    for i in range(message_length):
        sample = int(stego_audio[i])
        lsb = sample & 1
        lsbs.append(lsb)
    ones = sum(lsbs)
    zeros = len(lsbs) - ones
    return {
        'ones': int(ones),
        'zeros': int(zeros),
        'ratio': float(ones / len(lsbs)) if len(lsbs) > 0 else 0
    }
# Demo: Create synthetic audio signal
sample_rate = 16000
duration = 2.0 # seconds
num_samples = int(sample_rate * duration)
# Generate simple sine wave as carrier audio
frequency = 440 # A4 note
t = np.linspace(0, duration, num_samples, endpoint=False)
original_audio = np.sin(2 * np.pi * frequency * t)
# Scale to 16-bit integer range
original_audio = (original_audio * 32767).astype(np.int16)
# Secret message to embed
secret_message = "It always seems impossible until it's done"
# EMBEDDING PHASE
stego_audio, message_length = embed_message_lsb(original_audio, secret_message)
# EXTRACTION PHASE
extracted_message = extract_message_lsb(stego_audio, message_length)
# ANALYSIS
stats = calculate_embedding_stats(original_audio, stego_audio, message_length)
bit_dist = analyze_bit_distribution(stego_audio, message_length)
# Calculate sample differences for visualization
sample_indices = list(range(min(100, message_length)))
original_samples = [int(original_audio[i]) for i in sample_indices]
stego_samples = [int(stego_audio[i]) for i in sample_indices]
differences = [int(stego_audio[i] - original_audio[i]) for i in sample_indices]
# Prepare results
results = {
    'original_message': secret_message,
    'extracted_message': extracted_message,
    'message_length_chars': len(secret_message),
    'message_length_bits': message_length,
    'match': extracted_message == secret_message,
    'stats': stats,
    'bit_distribution': bit_dist,
    'sample_data': {
        'indices': sample_indices,
        'original': original_samples,
        'stego': stego_samples,
        'differences': differences
    }
}
print("RESULTS_JSON:" + json.dumps(results))
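The per-sample loops in the listing above can also be expressed with NumPy array operations. The following is a minimal sketch and not part of the lab's code: the function names `embed_bits_vectorized` and `extract_bits_vectorized` are made up for illustration, and it assumes an integer-typed carrier such as `int16` and a message whose characters all have code points below 256 (encoded as Latin-1 so the bit order matches `text_to_binary`).

```python
import numpy as np


def embed_bits_vectorized(samples, message):
    """Write the message bits into the LSBs of the first samples (sketch)."""
    # Latin-1 maps each character to one byte equal to its code point.
    payload = np.frombuffer(message.encode('latin-1'), dtype=np.uint8)
    bits = np.unpackbits(payload)  # 8 bits per byte, most significant bit first
    if bits.size > samples.size:
        raise ValueError(f"Message too long. Need {bits.size} samples, have {samples.size}")
    stego = samples.copy()
    # Clear every LSB, then OR in the message bit, for all samples at once.
    stego[:bits.size] = (stego[:bits.size] & ~1) | bits
    return stego, int(bits.size)


def extract_bits_vectorized(stego, n_bits):
    """Read n_bits LSBs back and decode them as Latin-1 text (sketch)."""
    n_bits -= n_bits % 8  # drop a trailing partial byte, as binary_to_text does
    bits = (stego[:n_bits] & 1).astype(np.uint8)
    return np.packbits(bits).tobytes().decode('latin-1')


# Usage with the same kind of synthetic carrier as the demo
t = np.linspace(0, 2.0, 32000, endpoint=False)
carrier = (np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)
message = "It always seems impossible until it's done"
stego, n_bits = embed_bits_vectorized(carrier, message)
recovered = extract_bits_vectorized(stego, n_bits)
print(recovered == message)
```

As in the loop version, the receiver has to know the message length in bits in advance; this sketch does not store it in the audio.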
The Art of Invisible Ink
Encryption hides the content of a message. Steganography hides the existence of the message itself. This lab implements Least Significant Bit (LSB) encoding—a digital equivalent of writing in invisible ink.
How It Works
Digital audio is stored as a series of 16-bit numbers (samples).
- Original Sample: 10101100 10101100 (Amplitude: 44204, read as an unsigned value)
- Modified Sample: 10101100 10101101 (Amplitude: 44205)
Changing the last bit changes the amplitude by one quantization step, which is 1/65536 of the full 16-bit range. This microscopic change is mathematically retrievable but inaudible in practice (SNR > 60 dB).
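The bit operation described above can be tried on a single value. This is a small illustrative sketch: the helper name `set_lsb` is made up here, and the SNR estimate assumes a full-scale 16-bit sine carrier in which every embedded sample ends up off by exactly one step, which is the worst case.

```python
import math


def set_lsb(sample, bit):
    """Clear the least significant bit of sample, then set it to bit."""
    return (sample & ~1) | bit


original = 0b1010110010101100   # the 16-bit pattern 10101100 10101100
modified = set_lsb(original, 1)

print(format(original, '016b'), original)
print(format(modified, '016b'), modified)

# Worst-case SNR estimate for a full-scale sine (peak amplitude 32767)
# when every sample is changed by one quantization step.
signal_power = (32767 ** 2) / 2   # mean power of a sine wave
noise_power = 1.0                 # squared error of 1 on every sample
snr_db = 10 * math.log10(signal_power / noise_power)
print(round(snr_db, 1))
```

Under these assumptions the estimate comes out near 87 dB. A quieter carrier gives a lower figure, since the error stays at one step while the signal power drops.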
The Trade-off: Fragility vs. Capacity
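LSB is high-capacity: at one bit per sample it carries roughly 5 kB/s per channel at a 44.1 kHz sample rate (2 kB/s at the 16 kHz used in this demo), and it is computationally trivial. However, it is extremely fragile.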
- Weakness: Any lossy compression (MP3, AAC) or volume change destroys the secret bits.
- Detection: While inaudible to the ear, it leaves a statistical fingerprint. In the Analysis Results section, watch the LSB distribution chart. Natural audio has uneven bit distribution; encrypted messages look like perfect static (50/50 distribution of 0s and 1s).
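The fragility noted above can be checked with a simple experiment. The sketch below is an illustration under stated assumptions, not the lab's code: it uses a synthetic sine carrier, models a volume change as a 10% gain reduction followed by rounding back to `int16`, and the helper name `read_lsbs` is made up.

```python
import numpy as np

# Synthetic carrier: 440 Hz sine at 16 kHz, scaled to the 16-bit range
t = np.linspace(0, 2.0, 32000, endpoint=False)
carrier = (np.sin(2 * np.pi * 440 * t) * 32767).astype(np.int16)

# Payload bits from ASCII text, most significant bit first
text = "It always seems impossible until it's done"
bits = np.unpackbits(np.frombuffer(text.encode('ascii'), dtype=np.uint8))

# Embed: clear each LSB and set it to the payload bit
stego = carrier.copy()
stego[:bits.size] = (stego[:bits.size] & ~1) | bits


def read_lsbs(samples, n):
    """Return the LSBs of the first n samples as a uint8 array."""
    return (samples[:n] & 1).astype(np.uint8)


# Untouched stego audio: every bit comes back intact
intact_errors = int(np.sum(read_lsbs(stego, bits.size) != bits))

# Simulated volume change: scale by 0.9 and re-quantize to int16
quieter = np.round(stego.astype(np.float64) * 0.9).astype(np.int16)
damaged_errors = int(np.sum(read_lsbs(quieter, bits.size) != bits))

print(f"bit errors without processing: {intact_errors} of {bits.size}")
print(f"bit errors after 10% gain reduction: {damaged_errors} of {bits.size}")
```

Re-quantization changes the parity of many samples, so a large share of the payload bits flips and the text can no longer be decoded.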
Modern Context
While LSB is the classic educational example, recent research uses deep learning to hide data in the frequency domain (spectrograms), aiming for changes that survive MP3 compression and are harder to detect statistically [1].
References
[1] J. Ros, M. Geleta, J. Pons, and X. Giro-i-Nieto, “Towards Robust Image-in-Audio Deep Steganography,” arXiv:2303.05007 [cs.CR], Mar. 2023. Available: https://arxiv.org/abs/2303.05007