callGPT - ChatGPT in a rotary dial telephone

Retro Calling: ChatGPT on Your Rotary Phone

In this project, you’ll convert a telephone so that you can call ChatGPT with it: You dial a number with the rotary dial, and you and ChatGPT start having a phone conversation. Not just a single question and an answer from the AI – but a real conversation where ChatGPT remembers what you said previously during the call.

A Python script runs on the Raspberry Pi, controlled by a mechanism under the telephone hook, and manages the conversation.

For this project, I used a FeTAp (Fernsprechtischapparat) 791 with a rotary dial from Germany. This telephone offers plenty of space inside: The Raspberry Pi can be elegantly stored under the rotary dial, the lavalier microphone finds its place in front of an opening in the housing, and all cables can be arranged so that the phone can be closed again without any problems.

I don’t know which country you live in – but I think rotary dial telephones existed worldwide until the 1990s. 🙂 You’ll surely find a suitable one in your local classifieds.

Also, no original parts were modified or damaged – so if ChatGPT should ever become history, the phone can be converted back in a few minutes. This is what the inside looks like after installation:

Rotary Dial Telephone with Raspberry Pi inside


Components You’ll Need

  • Rotary dial telephone
  • Raspberry Pi (model 4 should be sufficient)
  • Power supply
  • Lavalier microphone
  • 2.8 mm flat connectors
  • Button
  • Jumper cables
  • 3.5 mm audio cable

You’ll also need a soldering iron to connect the flat connectors and the button.

Setting Up the Hardware

Besides the Raspberry Pi, you’ll need some components that you can either connect easily via USB or that require a bit more work to modify or build. When fully assembled, the additional innards of the phone look like this (although the wires to the rotary dial are missing in this picture):

the assembled hardware components

On the right side, you can see the lavalier microphone. At the bottom of the Raspberry Pi’s pin header is a button soldered to two cables. At the line output (top), there’s a jack plug with a cable ending in two flat connectors. But let’s go through it one by one.

The Microphone

The easiest way to get your voice into the Raspberry Pi is with a USB microphone. In the photo above, you can see the Sennheiser XS-Lav USB-C lavalier microphone. Since the Raspberry only provides USB-A, there’s an adapter between the microphone and the USB port. Make sure to use an adapter cable. A plug adapter might be too large for the telephone housing.

For your first attempts, an inexpensive microphone will certainly be sufficient. However, it’s worth spending a bit more money: The microphone is not placed in the telephone handset (i.e., near your mouth) but in the housing behind some slits. This means it can be a good meter away from you – which is not a problem with a good microphone.

If you want, you can of course install the microphone in the telephone handset – but then an additional cable will run from the phone to the handset. What would be very difficult, however, is using the existing microphone there. I myself was unsuccessful in sending a usable audio signal to the Raspberry Pi via this method.

The Speaker in the Telephone Handset

With the microphone, you might be cheating a bit, as it’s not in the telephone handset. It’s different with the speaker – here the original is used. The line output of the Raspberry Pi is perfect for this: You prepare a jack cable (3.5 mm) with two flat connectors (2.8 mm) and plug the latter into the handset’s socket.

First, the cable: You can use a mono (one ring on the plug) or stereo cable (two rings). Cut off a piece about 10 cm long and strip the ends:

3.5 mm Jack

In the photo, you can see a stereo cable that houses three wires: a red, white, and yellow one. The first two transmit the right and left channels, the yellow one (which might be black in your case) is the ground. For the connection to the speaker in the handset, you need the ground and either the red or white cable.

Solder a flat connector with a width of 2.8 mm to each of these ends. These fit perfectly into the connection socket of the handset, which you can first carefully pull out of its connections in the phone. Now plug the two flat connectors into the socket, at the positions of the yellow and green wires:

Audio Wires for the Telephone Receiver

Note: Depending on which phone you’re using, the colors of the cables leading to the speaker in the handset may of course differ. In this case, unscrew the telephone handset and check quickly which cables you need to connect to the jack plug.

With that, your handset is ready to use. If you want to test it in advance, connect the jack plug to the Raspberry Pi and play an audio file. The speaker in the telephone handset should now play it.

The Button

Let’s move on to the mechanical part of the hardware – a button with which you control the dial tone in the telephone handset and of course can also end the conversation with ChatGPT.

For the connection to the Raspberry Pi, you need two jumper cables besides the button. These should be about 20 cm long so that you can place the button at a suitable spot in the housing – more on that in a moment. If you only have shorter cables on hand, you can also connect them together.

One pin of the button is connected to ground (GND) and the other to pin 11 (GPIO17) on the Raspberry Pi (you’ll see the wiring diagram shortly). You don’t need an external pull-up or pull-down resistor, as you’ll use the Raspberry Pi’s internal one.

Now the question is, where to put the button in the housing? It would be particularly practical if it didn’t have to be pressed separately but is activated as soon as the handset is picked up. For this, you can place the button under the fork mechanism, as shown in the following image:

Location of the Raspberry Pi inside the telephone

When the handset is on the hook, the telephone fork is pressed down – the mechanism then also presses the button down. As soon as you pick up the handset, the mechanism jumps up and releases the button – the Python script is started and you can use the rotary dial. However, this only works if the button is light enough for the weight of the handset to be sufficient to press it down. Some experimenting is required on your part. If you can’t find a suitable button, it will have to work the other way around: First pick up the handset and manually press the telephone fork down to start the script.

The Rotary Dial

With the rotary dial, you can dial a number (in the following, this is simply the telephone number 1) to start ChatGPT. For this, you of course need to connect it to your Raspberry Pi so that you can read the dialed number later.

To do this, however, you first need to find out which cables carry the pulses. If you only have two cables on your rotary dial, it’s easy: One delivers the pulses, the other you connect to GND. In my case, however, there are four unlabeled cables, which is why I measured with a multimeter.

Set the multimeter to resistance measurement and connect any two cables with the device. Now dial a number โ€“ if your multimeter shows a reading, you’ve found the right combination.

Testing the rotary dial with a multimeter

Now connect these two cables to the Raspberry Pi: One to GND, the other to pin 16 (GPIO23). Here’s the wiring diagram for the button and rotary dial connections:

How to connect Button and Rotary Dial on the Raspberry Pi
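Before moving on, it helps to know how a rotary dial is read in software: each digit arrives as a train of pulses (one pulse per digit, ten pulses for “0”), and mechanical contact bounce is filtered out with a debounce window. The following is a pure-Python sketch of that logic – the function names are illustrative, not taken from the project script:

```python
# Sketch of the pulse counting performed later in the Python script.
# Convention: a rotary dial sends N pulses for digit N and 10 pulses for "0".

DEBOUNCE_TIME = 0.1  # seconds, same value the script uses

def count_pulses(edge_times, debounce=DEBOUNCE_TIME):
    """Count pulses from a list of edge timestamps, ignoring contact bounce."""
    count = 0
    last = float("-inf")
    for t in edge_times:
        if t - last > debounce:
            count += 1
            last = t
    return count

def pulses_to_digit(pulse_count):
    """Map a completed pulse train to the dialed digit."""
    if not 1 <= pulse_count <= 10:
        raise ValueError("a rotary dial produces 1 to 10 pulses per digit")
    return 0 if pulse_count == 10 else pulse_count

# Dialing "3": three pulses 150 ms apart, each followed by a 5 ms contact bounce
edges = [0.0, 0.005, 0.150, 0.155, 0.300, 0.305]
print(pulses_to_digit(count_pulses(edges)))  # 3
```

The full script follows the same idea: it counts debounced edges on GPIO23 while the handset is off the hook, then interprets the total as the dialed digit.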

And that’s it for the hardware. If you want to use SSH to access the Raspberry Pi from your computer (like I did in the following text), you can already install it in your phone. If you’re using a mouse, keyboard, and monitor, it’s better to wait a bit before installing it.

Setting Up the Raspberry Pi

Before setting up your Raspberry Pi, you need an account and an API key from OpenAI. You also need to add some money to your wallet there, as using the API is not free (but not expensive for this project).

In this tutorial, you’ll learn how to create an account and an API key with OpenAI. You can find the current API prices here.

You’ll need the API key later for your Python script. But first, let’s continue with the operating system and the required libraries.

Installing the Operating System

To configure the operating system and write it to an SD card, proceed as follows:

  1. Download the Raspberry Pi Imager from the official website
  2. Launch the Imager and select your model (in this project, that’s a Raspberry Pi Zero 2) and “Raspberry Pi OS Lite (64-bit)”. You’ll find this under Raspberry Pi OS (other) and it comes without a graphical interface, as we don’t need it.
  3. Select your SD card as the destination
  4. Click on Edit Settings in the next screen and:
    • Set a username and password
    • Configure your Wi-Fi (SSID and password)
    • Activate SSH under the Services tab (use password for authentication)
  5. Confirm your settings with a click on Yes and write the image to the SD card

Connect to the Raspberry Pi via SSH

Once the Raspberry Pi Imager is finished, insert the SD card into the corresponding slot on the Raspberry Pi and start it up. Now you’ll need some patience – the first start with the new operating system can take a few minutes. Open the terminal or console on your computer and connect with the following command – replacing “pi” with the username you set previously in the Raspberry Pi Imager.

ssh pi@raspberrypi.local

Once your Raspberry Pi is ready, confirm the host key fingerprint with yes and then enter the password you set in the Pi Imager.

Update the System

Now you can update the operating system:

sudo apt update
sudo apt upgrade -y

Once that’s done, you can continue with three necessary packages: pip for installing Python libraries, venv for the virtual environment in which you’ll run the project, and FFmpeg to create audio files that you’ll need later in the script. These packages may already be installed – but to be safe, run the following command in the terminal:

sudo apt install -y python3-pip python3-venv ffmpeg

For GPIO access and audio also:

sudo apt install -y python3-gpiozero python3-rpi.gpio portaudio19-dev

Set Up a Virtual Environment

Create a directory for your project and set up a virtual environment:

mkdir -p ~/Desktop/callGPT
cd ~/Desktop/callGPT
python3 -m venv venv
source venv/bin/activate


Install Libraries

You’re now in the active virtual environment: Next, install the required Python libraries:

pip install openai python-dotenv pygame pyaudio numpy gpiozero RPi.GPIO lgpio gtts

Prepare Audio Files

Create the required audio files in the project directory. First, a dial tone for the phone – how about an A at 440 Hz?

ffmpeg -f lavfi -i "sine=frequency=440:duration=3" -c:a libmp3lame a440.mp3
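If you’d rather stay in Python, you can also generate the tone with only the standard library – here as a WAV file, which pygame plays just as well as an MP3 (if you go this route, point the tone path in the script at the .wav file instead):

```python
# Alternative to the ffmpeg command: write a 3-second 440 Hz sine wave
# as a WAV file using only the Python standard library.
import math
import struct
import wave

RATE = 44100       # samples per second
FREQ = 440         # concert pitch A
SECONDS = 3
AMPLITUDE = 20000  # comfortably below the 16-bit maximum of 32767

frames = b"".join(
    struct.pack("<h", int(AMPLITUDE * math.sin(2 * math.pi * FREQ * n / RATE)))
    for n in range(RATE * SECONDS)
)

with wave.open("a440.wav", "wb") as wf:
    wf.setnchannels(1)   # mono
    wf.setsampwidth(2)   # 16-bit samples
    wf.setframerate(RATE)
    wf.writeframes(frames)
```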

You also need two error messages as audio files – since your phone doesn’t have a display, you’ll need to hear them:

python -c "from gtts import gTTS; tts = gTTS('Please try again', lang='en'); tts.save('tryagain.mp3')"

python -c "from gtts import gTTS; tts = gTTS('Sorry, an error occurred', lang='en'); tts.save('error.mp3')"

Configure API Key

Since your API key should not appear directly in your Python script for security reasons, create a .env file with your OpenAI API key:

echo "OPENAI_API_KEY=your-openai-api-key" > .env
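For illustration, this is roughly what load_dotenv() will do with that file when the script starts – a stdlib-only sketch assuming simple KEY=value lines without quoting (read_env is a hypothetical helper, not part of the project):

```python
# Minimal sketch of how a .env file is read into key/value pairs.
from pathlib import Path

def read_env(path=".env"):
    """Parse simple KEY=value lines; comments and blank lines are skipped."""
    env = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            env[key.strip()] = value.strip()
    return env
```

The real script uses the python-dotenv library for this, which additionally exports the values into os.environ.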

Creating the Python Script

Now it’s time for the core of the project: the Python script. Create an empty file with:

nano callGPT.py

Paste the entire following code (in most terminals with CTRL+SHIFT+V or a right-click).

#!/usr/bin/env python3
"""
ChatGPT for Rotary Phone
-------------------------------------------------------------
https://en.polluxlabs.net

MIT License
Copyright (c) 2025 Frederik Kumbartzki

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
"""

import os
import sys
import time
import threading
from queue import Queue
from pathlib import Path

# Audio and speech libraries
os.environ['PYGAME_HIDE_SUPPORT_PROMPT'] = "hide"
import pygame
import pyaudio
import numpy as np
import wave
from openai import OpenAI

# OpenAI API Key
from dotenv import load_dotenv
load_dotenv()
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY")
if not OPENAI_API_KEY:
    print("Error: OPENAI_API_KEY not found.")
    sys.exit(1)

# Hardware libraries
from gpiozero import Button

# Constants and configurations
AUDIO_DIR = "/home/pi/Desktop/callGPT"  # adjust if your username is not "pi"
AUDIO_FILES = {
    "tone": f"{AUDIO_DIR}/a440.mp3",
    "try_again": f"{AUDIO_DIR}/tryagain.mp3",
    "error": f"{AUDIO_DIR}/error.mp3"
}
DIAL_PIN = 23  # GPIO pin for rotary dial
SWITCH_PIN = 17  # GPIO pin for hook switch

# Audio parameters
AUDIO_FORMAT = pyaudio.paInt16
CHANNELS = 1
SAMPLE_RATE = 16000
CHUNK_SIZE = 1024
SILENCE_THRESHOLD = 500
MAX_SILENCE_CHUNKS = 20  # About 1.3 seconds of silence
DEBOUNCE_TIME = 0.1  # Time in seconds for debouncing button inputs


class AudioManager:
    """Manages audio playback and recording."""
    
    def __init__(self):
        pygame.mixer.init(frequency=44100, buffer=2048)
        self.playing_audio = False
        self.audio_thread = None
        
        # Create temp directory
        self.temp_dir = Path(__file__).parent / "temp_audio"
        self.temp_dir.mkdir(exist_ok=True)
        
        # Preload sounds
        self.sounds = {}
        for name, path in AUDIO_FILES.items():
            try:
                self.sounds[name] = pygame.mixer.Sound(path)
            except Exception:
                print(f"Error loading {path}")
    
    def play_file(self, file_path, wait=True):
        try:
            sound = pygame.mixer.Sound(file_path)
            channel = sound.play()
            
            if wait and channel:
                while channel.get_busy():
                    pygame.time.Clock().tick(30)
        except Exception:
            pygame.mixer.music.load(file_path)
            pygame.mixer.music.play()
            
            if wait:
                while pygame.mixer.music.get_busy():
                    pygame.time.Clock().tick(30)
    
    def start_continuous_tone(self):
        self.playing_audio = True
        
        if self.audio_thread and self.audio_thread.is_alive():
            self.playing_audio = False
            self.audio_thread.join(timeout=1.0)
            
        self.audio_thread = threading.Thread(target=self._play_continuous_tone)
        self.audio_thread.daemon = True
        self.audio_thread.start()
    
    def _play_continuous_tone(self):
        try:
            if "tone" in self.sounds:
                self.sounds["tone"].play(loops=-1)
                while self.playing_audio:
                    time.sleep(0.1)
                self.sounds["tone"].stop()
            else:
                pygame.mixer.music.load(AUDIO_FILES["tone"])
                pygame.mixer.music.play(loops=-1)
                
                while self.playing_audio:
                    time.sleep(0.1)
                    
                pygame.mixer.music.stop()
        except Exception as e:
            print(f"Error during tone playback: {e}")
    
    def stop_continuous_tone(self):
        self.playing_audio = False
        
        if "tone" in self.sounds:
            self.sounds["tone"].stop()
            
        if pygame.mixer.get_init() and pygame.mixer.music.get_busy():
            pygame.mixer.music.stop()


class SpeechRecognizer:
    """Handles real-time speech recognition using OpenAI's Whisper API."""
    
    def __init__(self, openai_client):
        self.client = openai_client
        self.audio = pyaudio.PyAudio()
        self.stream = None
    
    def capture_and_transcribe(self):
        # Setup audio stream if not already initialized
        if not self.stream:
            self.stream = self.audio.open(
                format=AUDIO_FORMAT,
                channels=CHANNELS,
                rate=SAMPLE_RATE,
                input=True,
                frames_per_buffer=CHUNK_SIZE,
            )
        
        # Set up queue and threading
        audio_queue = Queue()
        stop_event = threading.Event()
        
        # Start audio capture thread
        capture_thread = threading.Thread(
            target=self._capture_audio, 
            args=(audio_queue, stop_event)
        )
        capture_thread.daemon = True
        capture_thread.start()
        
        # Process the audio
        result = self._process_audio(audio_queue, stop_event)
        
        # Cleanup
        stop_event.set()
        capture_thread.join()
        
        return result
    
    def _capture_audio(self, queue, stop_event):
        while not stop_event.is_set():
            try:
                data = self.stream.read(CHUNK_SIZE, exception_on_overflow=False)
                queue.put(data)
            except KeyboardInterrupt:
                break
    
    def _process_audio(self, queue, stop_event):
        buffer = b""
        speaking = False
        silence_counter = 0
        
        while not stop_event.is_set():
            if not queue.empty():
                chunk = queue.get()
                
                # Check volume
                data_np = np.frombuffer(chunk, dtype=np.int16)
                volume = np.abs(data_np).mean()
                
                # Detect speaking
                if volume > SILENCE_THRESHOLD:
                    speaking = True
                    silence_counter = 0
                elif speaking:
                    silence_counter += 1
                
                # Add chunk to buffer
                buffer += chunk
                
                # Process if we've detected end of speech
                if speaking and silence_counter > MAX_SILENCE_CHUNKS:
                    print("Processing speech...")
                    
                    # Save to temp file
                    temp_file = Path(__file__).parent / "temp_recording.wav"
                    self._save_audio(buffer, temp_file)
                    
                    # Transcribe
                    try:
                        return self._transcribe_audio(temp_file)
                    except Exception as e:
                        print(f"Error during transcription: {e}")
                        buffer = b""
                        speaking = False
                        silence_counter = 0
        
        return None
    
    def _save_audio(self, buffer, file_path):
        with wave.open(str(file_path), "wb") as wf:
            wf.setnchannels(CHANNELS)
            wf.setsampwidth(self.audio.get_sample_size(AUDIO_FORMAT))
            wf.setframerate(SAMPLE_RATE)
            wf.writeframes(buffer)
    
    def _transcribe_audio(self, file_path):
        with open(file_path, "rb") as audio_file:
            transcription = self.client.audio.transcriptions.create(
                model="whisper-1", 
                file=audio_file,
                language="en"
            )
        
        return transcription.text
    
    def cleanup(self):
        if self.stream:
            self.stream.stop_stream()
            self.stream.close()
            self.stream = None
        if self.audio:
            self.audio.terminate()
            self.audio = None


class ResponseGenerator:
    """Generates and speaks streaming responses from OpenAI's API."""
    
    def __init__(self, openai_client, temp_dir):
        self.client = openai_client
        self.temp_dir = temp_dir
        self.answer = ""
    
    def generate_streaming_response(self, user_input, conversation_history=None):
        self.answer = ""
        collected_messages = []
        chunk_files = []
        
        # Audio playback queue and control variables
        audio_queue = Queue()
        playing_event = threading.Event()
        stop_event = threading.Event()
        
        # Start the audio playback thread
        playback_thread = threading.Thread(
            target=self._audio_playback_worker,
            args=(audio_queue, playing_event, stop_event)
        )
        playback_thread.daemon = True
        playback_thread.start()
        
        # Prepare messages
        messages = [
            {"role": "system", "content": "You are a humorous conversation partner engaged in a natural phone call. Keep your answers concise and to the point."}
        ]
        
        # Use conversation history if available, but limit to last 4 pairs
        if conversation_history and len(conversation_history) > 0:
            if len(conversation_history) > 8:
                conversation_history = conversation_history[-8:]
            messages.extend(conversation_history)
        else:
            messages.append({"role": "user", "content": user_input})
        
        # Stream the response
        stream = self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            stream=True
        )
        
        # Variables for sentence chunking
        sentence_buffer = ""
        chunk_counter = 0
        
        for chunk in stream:
            if chunk.choices and hasattr(chunk.choices[0], 'delta') and hasattr(chunk.choices[0].delta, 'content'):
                content = chunk.choices[0].delta.content
                if content:
                    collected_messages.append(content)
                    sentence_buffer += content
                    
                    # Process when we have a complete sentence or phrase
                    if any(end in content for end in [".", "!", "?", ":"]) or len(sentence_buffer) > 100:
                        # Generate speech for this chunk
                        chunk_file_path = self.temp_dir / f"chunk_{chunk_counter}.mp3"
                        try:
                            # Generate speech
                            response = self.client.audio.speech.create(
                                model="tts-1", 
                                voice="alloy",  
                                input=sentence_buffer,
                                speed=1.0
                            )
                            response.stream_to_file(str(chunk_file_path))
                            chunk_files.append(str(chunk_file_path))
                            
                            # Add to playback queue
                            audio_queue.put(str(chunk_file_path))
                            
                            # Signal playback thread if it's waiting
                            playing_event.set()
                            
                        except Exception as e:
                            print(f"Error generating speech for chunk: {e}")
                        
                        # Reset buffer and increment counter
                        sentence_buffer = ""
                        chunk_counter += 1
        
        # Process any remaining text
        if sentence_buffer.strip():
            chunk_file_path = self.temp_dir / f"chunk_{chunk_counter}.mp3"
            try:
                response = self.client.audio.speech.create(
                    model="tts-1",
                    voice="alloy",
                    input=sentence_buffer,
                    speed=1.2
                )
                response.stream_to_file(str(chunk_file_path))
                chunk_files.append(str(chunk_file_path))
                audio_queue.put(str(chunk_file_path))
                playing_event.set()
            except Exception as e:
                print(f"Error generating final speech chunk: {e}")
        
        # Signal end of generation
        audio_queue.put(None)  # Sentinel to signal end of queue
        
        # Wait for playback to complete
        playback_thread.join()
        stop_event.set()  # Ensure the thread stops
        
        # Combine all messages
        self.answer = "".join(collected_messages)
        print(self.answer)
        
        # Clean up temp files
        self._cleanup_temp_files(chunk_files)
        
        return self.answer
    
    def _audio_playback_worker(self, queue, playing_event, stop_event):
        while not stop_event.is_set():
            # Wait for a signal that there's something to play
            if queue.empty():
                playing_event.wait(timeout=0.1)
                playing_event.clear()
                continue
            
            # Get the next file to play
            file_path = queue.get()
            
            # None is our sentinel value to signal end of queue
            if file_path is None:
                break
                
            try:
                # Play audio and wait for completion
                pygame.mixer.music.load(file_path)
                pygame.mixer.music.play()
                
                # Wait for playback to complete before moving to next chunk
                while pygame.mixer.music.get_busy() and not stop_event.is_set():
                    pygame.time.Clock().tick(30)
                    
                # Small pause between chunks for more natural flow
                time.sleep(0.05)
                
            except Exception as e:
                print(f"Error playing audio chunk: {e}")
    
    def _cleanup_temp_files(self, file_list):
        # Wait a moment to ensure files aren't in use
        time.sleep(0.5)
        
        for file_path in file_list:
            try:
                if os.path.exists(file_path):
                    os.remove(file_path)
            except Exception as e:
                print(f"Error removing temp file: {e}")


class RotaryDialer:
    """Handles rotary phone dialing and services."""
    
    def __init__(self, openai_client):
        self.client = openai_client
        self.audio_manager = AudioManager()
        self.speech_recognizer = SpeechRecognizer(openai_client)
        self.response_generator = ResponseGenerator(openai_client, self.audio_manager.temp_dir)
        
        # Set up GPIO
        self.dial_button = Button(DIAL_PIN, pull_up=True)
        self.switch = Button(SWITCH_PIN, pull_up=True)
        
        # State variables
        self.pulse_count = 0
        self.last_pulse_time = 0
        self.running = True
    
    def start(self):
        # Set up callbacks
        self.dial_button.when_pressed = self._pulse_detected
        self.switch.when_released = self._handle_switch_released
        self.switch.when_pressed = self._handle_switch_pressed
        
        # Start in ready state
        if not self.switch.is_pressed:
            # Receiver is picked up
            self.audio_manager.start_continuous_tone()
        else:
            # Receiver is on hook
            print("Phone in idle state. Pick up the receiver to begin.")
        
        print("Rotary dial ready. Dial a number when the receiver is picked up.")
        
        try:
            self._main_loop()
        except KeyboardInterrupt:
            print("Terminating...")
            self._cleanup()
    
    def _main_loop(self):
        while self.running:
            self._check_number()
            time.sleep(0.1)
    
    def _pulse_detected(self):
        if not self.switch.is_pressed:
            current_time = time.time()
            if current_time - self.last_pulse_time > DEBOUNCE_TIME:
                self.pulse_count += 1
                self.last_pulse_time = current_time
    
    def _check_number(self):
        if not self.switch.is_pressed and self.pulse_count > 0:
            self.audio_manager.stop_continuous_tone()
            time.sleep(1.5)  # Wait between digits
            
            if self.pulse_count == 10:
                self.pulse_count = 0  # "0" is sent as 10 pulses
            
            print("Dialed service number:", self.pulse_count)
            
            if self.pulse_count == 1:
                self._call_gpt_service()
                # Return to dial tone after conversation
                if not self.switch.is_pressed:  # Only if the receiver wasn't hung up
                    self._reset_state()
            
            self.pulse_count = 0
    
    def _call_gpt_service(self):
        # Conversation history for context
        conversation_history = []
        first_interaction = True
        
        # For faster transitions
        speech_recognizer = self.speech_recognizer
        response_generator = self.response_generator

        # Preparation for next recording
        next_recording_thread = None
        next_recording_queue = Queue()
        
        # Conversation loop - runs until the receiver is hung up
        while not self.switch.is_pressed:
            # If there's a prepared next recording thread, use its result
            if next_recording_thread:
                next_recording_thread.join()
                recognized_text = next_recording_queue.get()
                next_recording_thread = None
            else:
                # Only during first iteration or as fallback
                print("Listening..." + (" (Speak now)" if first_interaction else ""))
                first_interaction = False
                
                # Start audio processing
                recognized_text = speech_recognizer.capture_and_transcribe()
            
            if not recognized_text:
                print("Could not recognize your speech")
                self.audio_manager.play_file(AUDIO_FILES["try_again"])
                continue
            
            print("Understood:", recognized_text)
            
            # Update conversation history
            conversation_history.append({"role": "user", "content": recognized_text})
            
            # Start the next recording thread PARALLEL to API response
            next_recording_thread = threading.Thread(
                target=self._background_capture, 
                args=(speech_recognizer, next_recording_queue)
            )
            next_recording_thread.daemon = True
            next_recording_thread.start()
            
            # Generate the response
            response = response_generator.generate_streaming_response(recognized_text, conversation_history)
            
            # Add response to history
            conversation_history.append({"role": "assistant", "content": response})
            
            # Check if the receiver was hung up in the meantime
            if self.switch.is_pressed:
                break
        
        # If we get here, the receiver was hung up
        if next_recording_thread and next_recording_thread.is_alive():
            next_recording_thread.join(timeout=0.5)

    def _background_capture(self, recognizer, result_queue):
        try:
            result = recognizer.capture_and_transcribe()
            result_queue.put(result)
        except Exception as e:
            print(f"Error in background recording: {e}")
            result_queue.put(None)
    
    def _reset_state(self):
        self.pulse_count = 0
        self.audio_manager.stop_continuous_tone()
        self.audio_manager.start_continuous_tone()
        print("Rotary dial ready. Dial a number.")
    
    def _handle_switch_released(self):
        print("Receiver picked up - System restarting")
        self._restart_script()
    
    def _handle_switch_pressed(self):
        print("Receiver hung up - System terminating")
        self._cleanup()
        self.running = False

        # Complete termination after short delay
        threading.Timer(1.0, self._restart_script).start()
    
    def _restart_script(self):
        print("Script restarting...")
        self.audio_manager.stop_continuous_tone()
        os.execv(sys.executable, ['python'] + sys.argv)
    
    def _cleanup(self):
        # Terminate Audio Manager
        self.audio_manager.stop_continuous_tone()
        
        # Terminate Speech Recognizer if it exists
        if hasattr(self, 'speech_recognizer') and self.speech_recognizer:
            self.speech_recognizer.cleanup()
            
        print("Resources have been released.")


def main():
    # Initialize OpenAI client
    client = OpenAI(api_key=OPENAI_API_KEY)
    
    # Create and start the rotary dialer
    dialer = RotaryDialer(client)
    dialer.start()
    
    print("Program terminated.")

if __name__ == "__main__":
    main()

Now press CTRL+O to save and CTRL+X to exit.

Make the script directly executable:

chmod +x callGPT.py

Your First Phone Call

Start the script with the following command:

python3 callGPT.py 2>/dev/null

The suffix “2>/dev/null” redirects the standard error stream to /dev/null, so harmless warning messages (for example from the audio libraries) are suppressed and not displayed in the terminal.
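If you want to see what this redirection does, here is a minimal shell demo (the two echo commands merely simulate a program that writes to both streams):

```shell
# The inner command writes to stdout and stderr; 2>/dev/null discards
# only the stderr part, so just "normal output" survives.
out=$(sh -c 'echo "normal output"; echo "warning noise" >&2' 2>/dev/null)
echo "$out"
```

If you ever need those suppressed messages for debugging, redirect them to a file instead, e.g. `python3 callGPT.py 2>errors.log` (any filename works).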

Once the script is running, pick up the phone receiver and hold it to your ear. You should now hear the dial tone. Then dial 1 and start your conversation. It might not be the liveliest phone conversation, since it always takes a few seconds for ChatGPT to respond, but as mentioned at the beginning, you can refer back to previous statements within the conversation.

When you want to end the conversation, simply hang up the receiver. 🙂

Autostart

It would be practical, of course, if the script automatically started when your Raspberry Pi boots up. You can achieve this as follows:

  1. Create a service file:
sudo nano /etc/systemd/system/callgpt.service
  2. Add the following content and save the file:
[Unit]
Description=Rotary Phone GPT Service
After=network.target

[Service]
ExecStart=/home/pi/Desktop/callGPT/venv/bin/python3 /home/pi/Desktop/callGPT/callGPT.py
WorkingDirectory=/home/pi/Desktop/callGPT
StandardOutput=inherit
StandardError=inherit
Restart=always
RestartSec=10
User=pi
Environment="OPENAI_API_KEY=your_api_key_here"

[Install]
WantedBy=multi-user.target
  3. Enable and start the service:
sudo systemctl enable callgpt.service
sudo systemctl start callgpt.service
  4. Check the status:
sudo systemctl status callgpt.service

You should see the following output in the terminal:

Running callgpt.service with systemd

Now restart your Raspberry Pi:

sudo reboot

After a few seconds, pick up the receiver โ€“ you should hear the dial tone again.
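A small hardening tip: instead of hardcoding your API key in the unit file, systemd can read it from a separate file via the `EnvironmentFile` directive. The path below assumes the project directory used in this guide:

```ini
# In callgpt.service, replace the Environment= line with:
[Service]
EnvironmentFile=/home/pi/Desktop/callGPT/.env
```

The `.env` file would then contain a single line, `OPENAI_API_KEY=your_api_key_here`. After editing the unit file, apply the change with `sudo systemctl daemon-reload` followed by `sudo systemctl restart callgpt.service`.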

Areas for Customization

1. OpenAI Model Selection

# Stream the response
stream = self.client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    stream=True
)

You can change the model to use other OpenAI models like “gpt-4” or “gpt-3.5-turbo” depending on your preference for response quality vs. cost.
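If you want to switch models without editing the script, one option is to read the model name from an environment variable. A minimal sketch; the `CALLGPT_MODEL` variable name is my own invention, not part of the project:

```python
import os

def choose_model(default: str = "gpt-4o-mini") -> str:
    """Read the model name from CALLGPT_MODEL, falling back to the default."""
    return os.environ.get("CALLGPT_MODEL", default)

# The result would then be passed as the model= argument:
# stream = self.client.chat.completions.create(model=choose_model(), ...)
```

This way you can experiment with models from the command line (`CALLGPT_MODEL=gpt-4 python3 callGPT.py`) without touching the code.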

2. System Prompt Customization

# Prepare messages
messages = [
    {"role": "system", "content": "You are a humorous conversation partner engaged in a natural phone call. Keep your answers concise and to the point."}]

This is where you can customize the personality and behavior of the AI. You could make it more formal, give it a specific character, or program it with specific knowledge.
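As an illustration, you could keep several personas in a dictionary and build the message list from whichever one you pick. The persona texts and helper below are just a sketch; only "humorous" mirrors the original prompt:

```python
# Hypothetical persona presets; "humorous" is the prompt from this project.
PERSONAS = {
    "humorous": "You are a humorous conversation partner engaged in a natural "
                "phone call. Keep your answers concise and to the point.",
    "formal": "You are a polite, formal assistant on a phone call. "
              "Answer briefly and precisely.",
}

def build_messages(persona: str = "humorous") -> list:
    """Return the initial messages list with the chosen system prompt."""
    return [{"role": "system", "content": PERSONAS[persona]}]
```

You could even map different dialed numbers to different personas, so dialing 2 reaches a different "caller" than dialing 1.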

3. Voice Selection and Speed

# Generate speech
response = self.client.audio.speech.create(
    model="tts-1", 
    voice="alloy",
    input=sentence_buffer,
    speed=1.0
)

The voice can be changed to other OpenAI TTS voices like “nova”, “echo”, “fable”, “onyx”, or “shimmer”. The speed parameter can also be adjusted for faster or slower speech.
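To guard against typos while experimenting, a small validation helper can fail early with a clear message. The set below simply collects the tts-1 voices named above:

```python
# Voices available for OpenAI's tts-1 model, as listed in this guide.
TTS_VOICES = {"alloy", "echo", "fable", "onyx", "nova", "shimmer"}

def validate_voice(voice: str) -> str:
    """Raise a helpful error if the configured voice name is unknown."""
    if voice not in TTS_VOICES:
        raise ValueError(f"Unknown voice {voice!r}; pick one of {sorted(TTS_VOICES)}")
    return voice
```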

4. Language Settings

transcription = self.client.audio.transcriptions.create(
    model="whisper-1", 
    file=audio_file,
    language="en"
)

The language parameter can be changed to match your preferred language (e.g., “en” for English, “fr” for French).

5. Hardware Configurations

# Constants and configurations
AUDIO_DIR = "/home/pi/Desktop/callGPT"
AUDIO_FILES = {
    "tone": f"{AUDIO_DIR}/a440.mp3",
    "try_again": f"{AUDIO_DIR}/tryagain.mp3",
    "error": f"{AUDIO_DIR}/error.mp3"
}
DIAL_PIN = 23  # GPIO pin for rotary dial
SWITCH_PIN = 17  # GPIO pin for hook switch

These pin numbers and audio file paths can be adjusted based on the specific wiring setup and preferences.
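Since a wrong AUDIO_DIR is easy to overlook, a quick startup check that reports missing audio files can save debugging time. A sketch that works on the AUDIO_FILES dictionary shown above (the helper name is my own):

```python
import os

def missing_audio_files(audio_files: dict) -> list:
    """Return the names of configured audio files that don't exist on disk."""
    return [name for name, path in audio_files.items()
            if not os.path.exists(path)]
```

Calling `missing_audio_files(AUDIO_FILES)` at startup and printing a warning for each missing entry makes a misconfigured path obvious immediately, instead of failing later mid-call.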

6. Audio Detection Parameters

# Audio parameters
AUDIO_FORMAT = pyaudio.paInt16
CHANNELS = 1
SAMPLE_RATE = 16000
CHUNK_SIZE = 1024
SILENCE_THRESHOLD = 500
MAX_SILENCE_CHUNKS = 20  # About 1.3 seconds of silence

These parameters affect how the system detects and processes speech. The SILENCE_THRESHOLD determines how loud a sound needs to be to register as speech, and MAX_SILENCE_CHUNKS controls how long a silence needs to be to register as the end of a speech segment.
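The "About 1.3 seconds" comment follows directly from these parameters: each chunk covers CHUNK_SIZE / SAMPLE_RATE seconds of audio, and the end-of-speech pause is MAX_SILENCE_CHUNKS times that. A quick calculation:

```python
SAMPLE_RATE = 16000        # samples per second
CHUNK_SIZE = 1024          # samples per chunk
MAX_SILENCE_CHUNKS = 20

# Duration of one chunk and of the full silence window, in seconds
chunk_seconds = CHUNK_SIZE / SAMPLE_RATE            # 0.064 s per chunk
silence_seconds = MAX_SILENCE_CHUNKS * chunk_seconds

print(f"{silence_seconds:.2f} s")  # 1.28 s, i.e. roughly 1.3 seconds
```

So if the system cuts you off too eagerly, raising MAX_SILENCE_CHUNKS (say, to 30 for about 1.9 seconds) gives you longer pauses mid-sentence.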

7. Conversation History Management

# Use conversation history if available, but limit to last 4 pairs
if conversation_history and len(conversation_history) > 0:
    if len(conversation_history) > 8:
        conversation_history = conversation_history[-8:]
    messages.extend(conversation_history)

This controls how much conversation history is retained. You could adjust the number to keep more or less context, affecting both conversation quality and API usage.
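The same trimming can be expressed as a small helper, which also keeps the "4 pairs" limit adjustable in one place (the function name is my own):

```python
def trim_history(history: list, max_messages: int = 8) -> list:
    """Keep only the most recent messages (8 = 4 user/assistant pairs)."""
    return history[-max_messages:] if len(history) > max_messages else history
```

Raising `max_messages` gives ChatGPT more context to refer back to, at the cost of sending more tokens (and thus higher API usage) with every request.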

And that’s it! You now have an old rotary dial telephone that you can use to call ChatGPT. Currently, only the “telephone number” 1 has a function, but of course you can add many more ideas here: current news, a web search, the weather forecast… let your imagination run wild!
