In this article, we'll share how we've used AI in our clients' projects, the technologies we used, and the challenges we solved, such as object recognition, content generation, and creating virtual assistants.
You can also read about our AI development and integration experiences here
FRP: virtual assistant, playlist generation, and music recognition
Franchise Record Pool is a tool for professional DJs. It offers a catalog of 720,000 licensed tracks from labels like Sony Music, Universal, and Virgin Records.
Thanks to integration with the Serato DJ software, DJs can work with tracks directly in FRP without using third-party services. The catalog also provides essential information about each track, such as key, BPM (beats per minute), sources, and all existing remixes.
We’ve also added a virtual AI assistant that responds to voice commands and a music recognition feature similar to Shazam.
Virtual Assistant
To prepare for performances, DJs need to create playlists of tracks that match specific parameters like tone, tempo, genre, and style.
To help with this, we developed a virtual AI assistant that responds to voice commands. A DJ can ask the assistant to create a playlist with specific criteria, such as "Make a playlist with Latin pop music, bpm 150."
The AI searches the FRP track database for matching tracks and creates a playlist that can be saved or downloaded. DJs can also provide additional details to refine the selection further.
Music Recognition
We developed a music recognition feature so that during a performance, a DJ can identify tracks that another DJ has remixed and instantly add them to their collection.
The AI recognizes the remix, provides a list of the original tracks used, and shows how many times those tracks have been remixed.
Technologies
To develop the AI functions for FRP, we used several tools and APIs, wired together in a pipeline sketched below:
- OpenAI API: Used to identify the type of playlist a DJ wants to create (genre, tempo, tonality), and generate playlist titles and descriptions.
- Whisper: Used to transcribe users' voice commands into text.
- Amazon Polly: Used to convert text into spoken responses for the virtual assistant.
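For illustration, here is a minimal sketch of how these pieces can be wired together in Python. The model name is a placeholder, `search_tracks()` stands in for a query against the FRP catalog, and the production pipeline differs in detail:

```python
import json
import boto3
from openai import OpenAI

client = OpenAI()            # reads OPENAI_API_KEY from the environment
polly = boto3.client("polly")

def transcribe(audio_path: str) -> str:
    """Turn the DJ's voice command into text with Whisper."""
    with open(audio_path, "rb") as audio:
        result = client.audio.transcriptions.create(model="whisper-1", file=audio)
    return result.text

def extract_criteria(command: str) -> dict:
    """Ask the model to pull genre, BPM, and key out of the command as JSON."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        response_format={"type": "json_object"},
        messages=[
            {"role": "system",
             "content": "Extract playlist criteria (genre, bpm, key) from the DJ's request. Reply with JSON."},
            {"role": "user", "content": command},
        ],
    )
    return json.loads(response.choices[0].message.content)

def speak(text: str) -> bytes:
    """Voice the assistant's reply with Amazon Polly."""
    audio = polly.synthesize_speech(Text=text, OutputFormat="mp3", VoiceId="Joanna")
    return audio["AudioStream"].read()

# Example flow (search_tracks is a hypothetical FRP catalog query):
# criteria = extract_criteria(transcribe("command.wav"))
# playlist = search_tracks(**criteria)
# reply_audio = speak(f"Created a playlist with {len(playlist)} tracks.")
```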
These AI features have significantly enhanced the user experience in FRP. The virtual assistant and music recognition capabilities make it simpler for DJs to discover and select music for performances, creating a more personalized process.
BlaBlaPlay: smart recording prompts for users, content moderation, and feed recommendations
BlaBlaPlay is an anonymous social network where users freely exchange ideas and connect with like-minded people. Users listen to voice message cards and mark the ones they want to respond to as favorites; doing so opens a chat room with the card's author, where the conversation continues via voice or text. If a user dislikes a voice card, they can swipe left, as in Tinder, to view the next card.
We integrated AI in three key ways:
- Generating prompts to encourage user interaction.
- Providing smart feed recommendations to guide topic selection for voice recordings.
- Utilizing AI speech recognition to identify and prevent inappropriate language, ensuring a positive community experience.
Recording Prompts Generation
After the app was launched, we noticed that many users were responding to voice cards with silence. To encourage more interaction among users, we implemented AI to generate prompts featuring sample conversation topics. This approach helped users engage more creatively in their responses.
Feed Recommendations
We've also introduced smart recommendations to assist users in selecting a theme for their voice cards. If a user is unsure, the app suggests a random topic based on popular and recently discussed themes among other users. This feature helps streamline the card creation process and encourages more active participation.
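Both the recording prompts and the feed suggestions are produced through the OpenAI API (see Technologies below). A minimal illustrative sketch, with a hypothetical list of recently popular themes and a placeholder model name:

```python
from openai import OpenAI

client = OpenAI()

def suggest_topic(popular_themes: list[str]) -> str:
    """Suggest one voice-card topic based on themes users have discussed recently."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system",
             "content": "Suggest one short, friendly conversation topic for an anonymous voice-message app."},
            {"role": "user",
             "content": "Recently popular themes: " + ", ".join(popular_themes)},
        ],
    )
    return response.choices[0].message.content.strip()

# Example: suggest_topic(["travel fails", "comfort food", "first concerts"])
```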
Content Moderation
We employed AI for enhanced content moderation. We trained a neural network to identify specific words and phrases that could potentially offend other users. Upon analysis, the neural network provides a label indicating if the voice message contains any of these flagged elements. Administrators can then decide whether to delete the card or block the user based on this information.
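As a simplified sketch of that moderation step, assume the voice card has already been transcribed with Whisper and that a fine-tuned text classifier is available; the checkpoint and label names below are hypothetical:

```python
from transformers import pipeline

# Hypothetical fine-tuned checkpoint for flagging offensive wording
classifier = pipeline("text-classification", model="our-org/voice-card-moderation")

def moderate(transcript: str) -> dict:
    """Return a label and score that admins can review before deleting a card or blocking a user."""
    result = classifier(transcript, truncation=True)[0]
    # The label names depend on how the classifier was fine-tuned
    return {"flagged": result["label"] == "OFFENSIVE", "score": result["score"]}
```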
Technologies
We used the following technologies to build these AI features:
- OpenAI API: Used for generating prompts and smart recommendations.
- Whisper: Employed for transcribing voice cards into text for server-side neural network analysis.
- Core ML: Implemented in the mobile applications to run AI features directly on users' devices, thereby reducing latency and enhancing performance (see the conversion sketch below).
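For context, running a model on-device means converting it to the Core ML format first. Below is a minimal conversion sketch with coremltools, using a stand-in PyTorch model rather than our production network:

```python
import torch
import torchvision
import coremltools as ct

# Stand-in model; in practice this would be the trained network shipped in the app
model = torchvision.models.mobilenet_v3_small(weights=None).eval()
example = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example)

# Convert the traced model to Core ML and save it for bundling into the iOS app
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=example.shape)],
    convert_to="mlprogram",
)
mlmodel.save("OnDeviceModel.mlpackage")
```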
These AI integrations enabled us to address multiple challenges effectively. Firstly, we enhanced user engagement through AI prompts and smart recommendations featuring sample conversation topics. Secondly, we employed a neural network for additional content moderation, ensuring a safer and more enjoyable user experience.
FashionAI: object recognition and personalized outfit recommendations
FashionAI is our current project in development—a mobile assistant designed to organize users’ closets and curate stylish outfits, all powered by artificial intelligence.
Object Recognition
In many similar apps, users typically take photos of items from their closet and manually input parameters such as type, color, season, and occasion.
In our app, we've developed an AI algorithm that automatically identifies items in photos. The AI can recognize the type of clothing, fabric, color, and pattern, along with details like sleeve length, neckline type, and other style elements crucial for creating fashionable looks.
Given that our app is targeted at the Indian market, we've trained the AI to also recognize Indian ethnic wear.
Outfit Recommendations
Once the entire closet is "digitized," users can receive personalized recommendations for creating outfits. These recommendations consider users' preferences, weather conditions, and calendar events.
During registration, users complete a survey where they specify their clothing style, favorite colors, and celebrities whose fashion they admire. Our AI analyzes this information along with the items in their closet to suggest stylish outfits tailored to the season, weather, and upcoming events.
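Purely for illustration, here is a toy sketch of the kind of scoring such a recommender can apply; the fields, weights, and thresholds are made up and do not reflect our production logic:

```python
from dataclasses import dataclass

@dataclass
class Item:
    category: str   # e.g. "kurta", "jeans", "dress"
    color: str
    warmth: int     # 1 (light) .. 5 (warm)
    formality: int  # 1 (casual) .. 5 (formal)

def score_outfit(items: list[Item], favorite_colors: set[str],
                 temperature_c: float, event_formality: int) -> float:
    """Toy scoring: reward favorite colors, weather-appropriate warmth, and dress-code fit."""
    color_score = sum(i.color in favorite_colors for i in items) / len(items)
    target_warmth = 5 if temperature_c < 10 else 3 if temperature_c < 22 else 1
    warmth_score = 1 - abs(sum(i.warmth for i in items) / len(items) - target_warmth) / 4
    formality_score = 1 - abs(sum(i.formality for i in items) / len(items) - event_formality) / 4
    return 0.3 * color_score + 0.35 * warmth_score + 0.35 * formality_score
```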
Technologies
For the AI features, we used TensorFlow Lite with a YOLOv8m model, Apple's Vision framework, PyTorch, and the CLIP neural network from OpenAI.
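As a rough illustration of the recognition flow, the sketch below pairs a YOLOv8 detector with CLIP for attribute tagging. The checkpoints and attribute prompts are placeholders (the real taxonomy covers sleeves, necklines, ethnic wear, and more), and on devices the models run through TensorFlow Lite and the Vision framework rather than this server-style Python code:

```python
from PIL import Image
from transformers import CLIPModel, CLIPProcessor
from ultralytics import YOLO

detector = YOLO("yolov8m.pt")  # in production, a detector fine-tuned on clothing
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Placeholder attribute prompts; the real list is much richer
ATTRIBUTES = ["a cotton kurta", "a silk saree", "a denim jacket", "a floral dress"]

def tag_items(photo_path: str) -> list[str]:
    """Detect clothing items in a photo, then pick the best-matching description per crop."""
    image = Image.open(photo_path)
    tags = []
    for box in detector(photo_path)[0].boxes.xyxy.tolist():
        crop = image.crop(tuple(int(v) for v in box))
        inputs = processor(text=ATTRIBUTES, images=crop, return_tensors="pt", padding=True)
        probs = clip(**inputs).logits_per_image.softmax(dim=1)
        tags.append(ATTRIBUTES[probs.argmax().item()])
    return tags
```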
In this AI-assisted project, we streamlined closet organization significantly through automatic clothing recognition and categorization. Additionally, our flexible recommendation system, tailored to users' personal preferences, simplifies the process of creating stylish outfits.
ALDA (AI Learning Assistant): learning materials generation
ALDA is another project currently under development: an AI-based virtual assistant that helps professors at colleges and universities across the United States create curricula and educational courses more efficiently.
Learning Materials Generation
Learning materials for colleges and universities must adhere to specific formats. We've trained our AI to generate syllabi and courses following customizable structures that users can define themselves, so each institution can create its own templates.
Users simply need to specify the subject, specific topic, and course difficulty level. The AI then generates a course outline complete with detailed descriptions for each item, lecture content, and even assessment questions to gauge student understanding. Users have the flexibility to add context and edit the generated materials as required.
Technologies
We used the OpenAI Assistants API with GPT-4 to build the assistant.
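A minimal sketch of how a syllabus draft can be requested through the Assistants API (openai Python SDK v1.x); the instructions, template handling, and polling here are simplified placeholders rather than ALDA's actual prompts:

```python
import time
from openai import OpenAI

client = OpenAI()

assistant = client.beta.assistants.create(
    name="Syllabus drafting assistant",
    model="gpt-4",
    instructions="Generate course outlines that follow the template supplied by the user.",
)

def draft_syllabus(subject: str, topic: str, level: str, template: str) -> str:
    """Ask the assistant for a course outline matching the institution's template."""
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content=f"Subject: {subject}\nTopic: {topic}\nDifficulty: {level}\nTemplate:\n{template}",
    )
    run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
    while run.status not in ("completed", "failed", "cancelled", "expired"):
        time.sleep(1)
        run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    return messages.data[0].content[0].text.value  # newest message first
```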
Implementing AI has streamlined the creation and customization of training materials to align with the unique needs of various colleges and universities. Furthermore, it has expedited the process of updating and editing these materials, with the AI automatically updating all pertinent items as needed.
Translinguist: AI-powered machine translation and interpretation
Translinguist is a platform designed for events featuring interpreters. During video chats, users can access simultaneous and consecutive interpretation, as well as sign language interpretation. Each participant hears only the translation they require and sees automatically generated subtitles when necessary.
Additionally, we have integrated an AI-based machine translation feature.
Machine Translation and Interpretation
Artificial intelligence recognizes the user's speech and automatically translates it into the selected language. The platform supports 62 languages.
The AI accurately voices the translation, capturing nuances such as pace, intonation, and pauses to mimic natural speech. Neural network processing and analysis ensure that extraneous noise is minimized and the context of speech is correctly interpreted. This includes handling special terms, names, and titles specific to each language.
Technologies
To implement AI functions, we integrated three types of services that work together seamlessly: Speech-to-text, Text-to-speech, and Text-to-text. The system selects the most suitable component based on the languages used.
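Conceptually, the system chains the three service types and picks a provider per language pair. Below is a simplified sketch with hypothetical provider interfaces; the actual services and selection rules differ:

```python
from typing import Protocol

class SpeechToText(Protocol):
    def transcribe(self, audio: bytes, language: str) -> str: ...

class TextToText(Protocol):
    def translate(self, text: str, source: str, target: str) -> str: ...

class TextToSpeech(Protocol):
    def synthesize(self, text: str, language: str) -> bytes: ...

class TranslationPipeline:
    """Chains speech-to-text -> text-to-text -> text-to-speech, choosing a provider per language."""

    def __init__(self, stt: dict[str, SpeechToText],
                 mt: dict[tuple[str, str], TextToText],
                 tts: dict[str, TextToSpeech]):
        self.stt, self.mt, self.tts = stt, mt, tts

    def interpret(self, audio: bytes, source: str, target: str) -> tuple[bytes, str]:
        text = self.stt[source].transcribe(audio, source)
        translated = self.mt[(source, target)].translate(text, source, target)
        voiced = self.tts[target].synthesize(translated, target)
        return voiced, translated  # audio for the listener, text for subtitles
```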
AI-based machine translation has significantly streamlined multilingual translation during video conferences. Rather than sourcing live interpreters for each language pair, users can opt for machine translation for certain languages at comparable quality and efficiency.
To sum up
In our projects, we've implemented artificial intelligence features across three main categories:
- AI recognition
- AI generation
- AI recommendations
You can find more about our experience in AI development and integration here
Interested in developing your own AI-powered project? Contact us or book a quick call
We offer a free personal consultation to discuss your project goals and vision, recommend the best technology, and prepare a custom architecture plan.