# Lucida
Speech and Vision Based Intelligent Personal Assistant
Lucida is a speech and vision based intelligent personal assistant inspired by Sirius. Visit our website for a tutorial, and Lucida-users for help. The project is released under the BSD license, except that certain submodules contain their own specific licensing information. We would love to have your help in improving Lucida; see CONTRIBUTING for more details.
## Overview

- `lucida`: back-end services and the command center (CMD). Currently, there are 7 categories of back-end services: "ASR" (automatic speech recognition), "IMM" (image matching), "QA" (question answering), "CA" (calendar events retrieval), "IMC" (image classification), "FACE" (facial recognition), and "DIG" (digit recognition). You can delete or replace these services with your own, or you can simply add a new service. For example, if you know of a better ASR implementation, have an interesting image captioning end-to-end system, or have access to a quality machine translation algorithm, please read the section "How to Add Your Own Service into Lucida?" below.

  The command center determines which services are needed based on the user input, sends requests to them, and returns a response to the user. In the following diagram, the user asks a query that needs three services: ASR, IMM, and QA. The "cloud" behind each box represents the Docker container(s) running on the host machine(s).

  <p align="center"> <img src="high_level.png" alt="" width="600" /> </p>

- `tools`: dependencies necessary for compiling Lucida. Because services share some common dependencies, all services should be compiled after these dependencies are installed. The advantage of a central point of dependencies is that the total size of the compiled services is minimized; the disadvantage is that it makes deleting a service from Lucida non-trivial -- you have to remove its dependencies in `tools/`.
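As a rough illustration of the routing described above, the command center's service selection can be sketched as follows. This is a toy Python sketch, not Lucida's actual implementation; the query dictionary shape and the dispatch rules are assumptions for illustration only.

```python
# Toy sketch of command-center routing: choose back-end services
# based on what the user supplied (audio, image, text), then the
# command center would send requests to each in turn.
# The service names mirror the categories listed above.

def route(query):
    """Return the ordered list of services a query needs."""
    pipeline = []
    if query.get("audio"):
        pipeline.append("ASR")   # transcribe speech to text first
    if query.get("image"):
        pipeline.append("IMM")   # match the image against stored images
    if query.get("audio") or query.get("text"):
        pipeline.append("QA")    # answer the (transcribed) question
    return pipeline

print(route({"audio": b"...", "image": b"..."}))  # ['ASR', 'IMM', 'QA']
```

This mirrors the diagram: a spoken question about an image fans out to ASR, IMM, and QA, while a typed question alone would only need QA.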
## Lucida Local Development

If you want to make contributions to Lucida, please build it locally:

- From this directory, type `make local`. This will run scripts in `tools/` to install all the required dependencies. After that, it will compile the back-end services in `lucida/`.

  Important note for Ubuntu 16.04 users: please read note #1.

- If for some reason you need to compile only part of it (e.g. one back-end service), make sure to set the following environment variable as set in the Makefile: `export LD_LIBRARY_PATH=/usr/local/lib`. You can add it permanently to your bash profile.

- Start all services with `make start_all`. This will spawn a terminal window (`gnome-terminal`) for each service as well as the command center. Once they are all running, open your browser and visit `http://localhost:3000/`. Check out the tutorial for usage and sample questions. Currently, the command center receives the user input in the form of HTTP requests sent from your browser, but in the future we may support other forms of input.
## Lucida Docker Deployment

If you want to use Lucida as a web application, please deploy it using Docker and Kubernetes:

- Install Docker: refer to https://docs.docker.com/engine/installation/.

- Navigate to `tools/deploy/` and follow the instructions there.

- Once done, check out the tutorial for usage and sample questions.
## REST API for command center

The REST API is in active development and may change drastically. It currently supports only `infer` and `learn`; other features may be added later. An example client for botframework is available. Information on how to use the API can be found in the wiki.
## Design Notes -- How to Add Your Own Service into Lucida?

### Back-end Communication

Thrift is an RPC framework with the advantages of being efficient and language-neutral. It was originally developed by Facebook and is now developed both by the open-source community (Apache Thrift) and by Facebook (Facebook Thrift). We use both because Facebook Thrift has a fully asynchronous C++ server but does not support Java very well, and Apache Thrift seems to be more popular. Therefore, we recommend using Apache Thrift for services written in Python and Java, and Facebook Thrift for services written in C++. However, you can choose either one for your own service as long as you follow the steps below.

One disadvantage of Thrift is that the interface has to be pre-defined and implemented by each service. If the interface changes, all services have to re-implement it. We try to avoid changing the interface through careful design, but if you really need to adapt the interface to your needs, feel free to modify it -- just make sure that all services implement and use the new interface.
### Detailed Instructions

Besides implementing the Thrift interface, you need to configure the command center (CMD) in order to add your own service into Lucida. Let's break it down into two steps:

1. Implement the Thrift interface jointly defined in `lucida/lucidaservice.thrift` and `lucida/lucidatypes.thrift`:

   ```thrift
   include "lucidatypes.thrift"

   service LucidaService {
       void create(1:string LUCID, 2:lucidatypes.QuerySpec spec);
       void learn(1:string LUCID, 2:lucidatypes.QuerySpec knowledge);
       string infer(1:string LUCID, 2:lucidatypes.QuerySpec query);
   }
   ```

   The basic functionalities that your service needs to provide are called `create`, `learn`, and `infer`. They all take the same parameters: a `string` representing the Lucida user ID (`LUCID`), and a `QuerySpec` defined in `lucida/lucidatypes.thrift`. The command center invokes these three procedures on your service, and services can also invoke them on each other to communicate. Thus the typical data flow looks like this:

   `Command Center (CMD) -> Your Own Service (YOS)`

   But it can also look like this:

   `Command Center (CMD) -> Your Own Service 0 (YOS0) -> Your Own Service 1 (YOS1) -> Your Own Service 2 (YOS2)`

   In this scenario, make sure to implement the asynchronous Thrift interface. If YOS0 implements the asynchronous Thrift interface, it won't block waiting for the response from YOS1. If YOS0 implements the synchronous Thrift interface, it cannot make progress until YOS1 returns the response, so the operating system will perform a thread context switch and let the current thread sleep until YOS1 returns. See section 3 of step 1 for implementation details.

   - `create`: create an intelligent instance based on the supplied LUCID. It gives services a chance to warm up the pipeline, but our current services do not need that, so the command center does not send a `create` request at this point. If your service needs to warm up for each user, make sure to modify the command center, as detailed in step 2.

   - `learn`: tell the intelligent instance to learn new knowledge based on the data supplied in the query, which usually means the training process. Although it has to be implemented, you can choose to do nothing in the function body if your service cannot learn new knowledge. For example, it may be hard to retrain a DNN model, so the facial recognition service simply prints a message when it receives a `learn` request. Otherwise, consider using a database system to store the new knowledge; currently, we use MongoDB to store the text and image knowledge. You need to tell the command center whether or not to send a `learn` request to your service, as detailed in step 2.

   - `infer`: ask the intelligent instance to infer using the data supplied in the query, which usually means the prediction process.

   Notice all three functions take a `QuerySpec` as their second parameter, so let's see what `QuerySpec` means for each function.
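As a concrete (toy) sketch of the three procedures, a text-oriented service could be written as below. This is plain Python with an in-memory store standing in for the Thrift-generated handler and for MongoDB; the class name and the substring-matching "inference" are illustrative assumptions, not one of Lucida's actual services.

```python
# Toy handler implementing the three procedures of the Thrift interface.
# A real service would subclass the code generated from lucidaservice.thrift;
# here the QuerySpec arguments are plain objects with .name and .content.

class ToyTextService:
    def __init__(self):
        self.knowledge = {}  # LUCID -> list of learned text snippets

    def create(self, LUCID, spec):
        # Warm-up hook; like Lucida's current services, little to do here.
        self.knowledge.setdefault(LUCID, [])

    def learn(self, LUCID, knowledge):
        # Store every piece of plain-text knowledge for this user.
        for inp in knowledge.content:          # iterate all QueryInputs
            if inp.type == "text":
                self.knowledge.setdefault(LUCID, []).extend(inp.data)

    def infer(self, LUCID, query):
        # Naive "inference": return the first stored fact that
        # contains the query text, or an empty string.
        wanted = query.content[0].data[0]
        for fact in self.knowledge.get(LUCID, []):
            if wanted in fact:
                return fact
        return ""
```

The important structural point is that `learn` and `infer` both receive a `QuerySpec` and walk its `content` list; the next section describes those types.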
   ```thrift
   struct QueryInput {
       1: string type;
       2: list<string> data;
       3: list<string> tags;
   }

   struct QuerySpec {
       1: string name;
       2: list<QueryInput> content;
   }
   ```

   A `QuerySpec` has a name, which is `create` for `create`, `knowledge` for `learn`, and `query` for `infer`. A `QuerySpec` also has a list of `QueryInput` called `content`, which is the data payload. A `QueryInput` consists of a `type`, a list of `data`, and a list of `tags`.

   - If the function call is `learn`:

     Currently only one `QueryInput` is constructed by the command center, but you should still iterate through all `QueryInput`s in case this changes in the future. For a `QueryInput`, `type` can be `text` for plain text, `url` for a url address to extract text from, `image` for an image, or `unlearn` (undo learn) for the reverse process of learn. Here are our assumptions: a service can handle either text or image; if it can handle text, the types your service should handle are `text`, `url`, and `unlearn`, and if it can handle image, the types your service should handle are `image` and `unlearn`. See step 2 for details on how to specify the type of knowledge that your service can learn. If `type` is `text` or `url`, `data[i]` is the `i`th piece of text or url as new knowledge and `tags[i]` is the id of the `i`th piece o

   - If the function call is
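The `learn` handling just described -- iterate every `QueryInput` and dispatch on its `type` -- can be sketched for a text-capable service as follows. The tag-keyed store, the dictionary-shaped inputs, and the `fetch_text_from` helper (stubbed here) are illustrative assumptions, not Lucida's actual code.

```python
# Sketch of dispatching on QueryInput.type inside learn() for a
# text-capable service: handle "text", "url", and "unlearn".

def fetch_text_from(url):
    # Hypothetical helper that would download and extract page text;
    # stubbed out so the sketch is self-contained.
    return "text extracted from " + url

def apply_learn(store, content):
    """store maps knowledge id (tags[i]) -> the ith piece of text."""
    for inp in content:                        # iterate every QueryInput
        if inp["type"] == "text":
            for text, tag in zip(inp["data"], inp["tags"]):
                store[tag] = text
        elif inp["type"] == "url":
            for url, tag in zip(inp["data"], inp["tags"]):
                store[tag] = fetch_text_from(url)
        elif inp["type"] == "unlearn":         # undo learn: drop by id
            for tag in inp["tags"]:
                store.pop(tag, None)
    return store
```

Keying the store by `tags[i]` is what makes `unlearn` cheap: the reverse of learn is just deleting the entries whose ids appear in `tags`.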