# Module-LLM

<div class="product_pic"><img class="pic" src="https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/docs/products/module/Module%20LLM/4.webp" width="25%"></div>

## Description

**Module LLM** is an integrated offline Large Language Model (LLM) inference module designed for terminal devices that require efficient, intelligent interaction. Whether for smart homes, voice assistants, or industrial control, Module LLM provides a smooth and natural AI experience without relying on the cloud, ensuring privacy and stability. With the integrated **StackFlow** framework and the **Arduino/UIFlow** libraries, smart features can be implemented with just a few lines of code.<br>
Powered by the advanced **AX630C** SoC, it integrates a 3.2 TOPS high-efficiency NPU with native support for Transformer models, handling complex AI tasks with ease. Equipped with **4GB LPDDR4** memory (1GB available for user applications, 3GB dedicated to hardware acceleration) and **32GB eMMC** storage, it supports parallel loading and sequential inference of multiple models, ensuring smooth multitasking. The main chip's runtime power consumption is approximately 1.5W, making it highly efficient and suitable for long-term operation.<br>
It features a built-in microphone, speaker, TF card slot, **USB OTG**, and RGB status light, meeting diverse application needs with support for voice interaction and data transfer. The module offers flexible expansion: the onboard SD card slot supports cold/hot firmware upgrades, and the **UART** communication interface simplifies connection and debugging, ensuring continuous optimization and expansion of module functionality. The USB port supports master/slave auto-switching, serving both as a debugging port and as a connection for additional USB devices such as cameras. Users can purchase the LLM debugging kit to add a 100 Mbps Ethernet port and a kernel serial port, using the module as an SBC.<br>
The module is compatible with multiple models and comes pre-installed with the **Qwen2.5-0.5B** language model. It provides **KWS** (wake word), **ASR** (speech recognition), **LLM** (large language model), and **TTS** (text-to-speech) functionality, each callable standalone or chained automatically as a **pipeline** for convenient development. Future support includes the Qwen2.5-1.5B, Llama3.2-1B, and InternVL2-1B models, with hot model updates to keep up with community trends and accommodate various complex AI tasks. Vision recognition capabilities include CLIP and YoloWorld, with DepthAnything, SegmentAnything, and other advanced models planned to further enhance intelligent recognition and analysis.<br>
Plug and play with **M5 hosts**, Module LLM offers an easy-to-use AI interaction experience. Users can quickly integrate it into existing smart devices without complex setup, enabling smart functionality and improving device intelligence. This product is suitable for offline voice assistants, text-to-speech conversion, smart home control, interactive robots, and more.

## Product Features

- Offline inference, 3.2 TOPS @INT8 computing power
- Integrated KWS (wake word), ASR (speech recognition), LLM (large language model), and TTS (text-to-speech) units
- Multi-model parallel processing
- Onboard 32GB eMMC storage and 4GB LPDDR4 memory
- Onboard microphone and speaker
- Serial communication
- SD card firmware upgrade
- Supports ADB debugging
- RGB indicator light
- Built-in Ubuntu system
- Supports OTG functionality
- Compatible with Arduino/UIFlow

## Applications

- Offline voice assistants
- Text-to-speech conversion
- Smart home control
- Interactive robots

## Specifications

| Specification    | Parameter |
| ---------------- | --------- |
| Processor SoC    | AX630C, dual-core Cortex-A53 @ 1.2 GHz <br> Max. 12.8 TOPS @INT4, 3.2 TOPS @INT8 |
| Memory           | 4GB LPDDR4 (1GB system memory + 3GB dedicated to hardware acceleration) |
| Storage          | 32GB eMMC 5.1 |
| Communication    | Serial, default baud rate 115200@8N1 (adjustable) |
| Microphone       | MSM421A |
| Audio Driver     | AW8737 |
| Speaker          | 8Ω@1W, size 2014 cavity speaker |
| Built-in Units   | KWS (wake word), ASR (speech recognition), LLM (large language model), TTS (text-to-speech) |
| RGB Light        | 3x RGB LED @2020, driven by LP5562 (status indication) |
| Power            | Idle: 5V@0.5W, full load: 5V@1.5W |
| Button           | Enters download mode for firmware upgrade |
| Upgrade Port     | SD card / Type-C port |
| Working Temp     | 0-40°C |
| Product Size     | 54 x 54 x 13mm |
| Packaging Size   | 133 x 95 x 16mm |
| Product Weight   | 17.4g |
| Packaging Weight | 32.0g |

## Related Links

- [AX630C](https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/docs/products/module/Module%20LLM/AX630C.pdf)

## PinMap

| Host         | RXD | TXD |
| ------------ | --- | --- |
| Core (Basic) | G16 | G17 |
| Core2        | G13 | G14 |
| CoreS3       | G18 | G17 |

> **LLM Module Pin Switching**: The LLM Module has reserved solder pads for pin switching. In case of pin multiplexing conflicts, the PCB trace can be cut and reconnected to another set of pins.

<img alt="module size" src="https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/docs/products/module/Module%20LLM/03.jpg" width="25%" />

> Taking `CoreS3` as an example, the first column (left green box) selects the TX pin for serial communication; users can choose one of four options as needed (from top to bottom: G18, G7, G14, and G10). The default is G18. To switch to a different pin, cut the trace on the solder pad at the red line (a blade is recommended), then bridge one of the three remaining pads below. The second column (right green box) selects the RX pin and likewise allows a choice of one out of four options.

## Video

- Module LLM product introduction and example showcase: [Module_LLM_Video.mp4](https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/docs/products/module/Module%20LLM/Module_LLM_Video.mp4)

## AI Benchmark Comparison

<img alt="compare" src="https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/docs/products/module/Module%20LLM/Benchmark%E5%AF%B9%E6%AF%94.png" width="100%" />
# StackFlow

<p align="center"><img src="https://static-cdn.m5stack.com/resource/public/assets/m5logo2022.svg" alt="basic" width="300" height="300"></p>

<p align="center">
  StackFlow is a simple, fast, and elegant one-stop AI service infrastructure project for embedded developers. Its goal is to let Makers and Hackers quickly obtain powerful AI acceleration capabilities on today's embedded devices. StackFlow can breathe an intelligent soul into all kinds of human-machine interaction devices.
</p>

## Table of Contents

* [Features](#features)
* [Demo](#demo)
* [System Requirements](#system-requirements)
* [Compile](#compile)
* [Installation](#installation)
* [Upgrade](#upgrade)
* [Run](#run)
* [Configuration](#configuration)
* [Interface](#interface)
* [Contribution](#contribution)

## Features
<!--  -->
* Distributed communication architecture. Each unit can operate independently or collaborate with other units.
* Multiple models supported, including but not limited to speech recognition, speech synthesis, image recognition, natural language processing, and LLM assistant inference.
* Internal data flow. Units can be configured to work together as needed, avoiding complex data-processing plumbing.
* Simple and easy to use. Data is exchanged as standard JSON, so AI services can be implemented quickly.
* Offline operation. Local AI services work without an internet connection.
* Multi-platform support, including but not limited to the Module LLM, the LLM630 Compute Kit, etc.
* Flexible configuration. Every unit's operational parameters are fully configurable, allowing models to be swapped and model parameters to be modified within the same data-flow scenario.
* Developer-friendly. Developers focus on the model and the hardware platform, not on the underlying communication and data-handling details.
* Efficient and stable. Data is transmitted over ZMQ channels, giving high efficiency, low latency, and strong stability.
* Open source and free. StackFlow is licensed under the MIT License.
* Multilingual. The core units are implemented in C++ with heavy performance optimization, and bindings can be added for any programming language with ZMQ support.

StackFlow is continuously being optimized and iterated; as the framework matures, more features will be added. Stay tuned.

Main working mode of the StackFlow voice assistant:

After startup, KWS, ASR, LLM, TTS, and AUDIO are configured to work collaboratively. When KWS detects the wake word in the audio it receives from the AUDIO unit, it sends a wake-up signal. ASR then starts recognizing the audio data from AUDIO and publishes the result to its output channel. Once LLM receives the text produced by ASR, it starts inference and publishes the result to its output channel. TTS, upon receiving LLM's result, synthesizes speech and plays the audio according to its configuration.
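
This hand-off between units rides on ZMQ channels. The sketch below shows one hop of the idea (ASR publishing to LLM) with pyzmq in a single process; the `inproc://asr.out` channel name and the JSON message fields are illustrative assumptions, not StackFlow's real protocol.

```python
# Minimal sketch of one pipeline hop (ASR -> LLM) over a ZMQ pub/sub channel.
# The channel name and JSON fields below are illustrative only.
import json
import time
import zmq

ctx = zmq.Context.instance()

# "ASR" side: publishes recognized text on its output channel.
asr_out = ctx.socket(zmq.PUB)
asr_out.bind("inproc://asr.out")

# "LLM" side: subscribes to ASR's output channel.
llm_in = ctx.socket(zmq.SUB)
llm_in.connect("inproc://asr.out")
llm_in.setsockopt_string(zmq.SUBSCRIBE, "")

time.sleep(0.2)  # give the subscription time to propagate (slow-joiner)

asr_out.send_string(json.dumps({"unit": "asr", "text": "turn on the light"}))
msg = json.loads(llm_in.recv_string())
print(msg["text"])  # the LLM unit would now run inference on this text
```

A real deployment would use separate processes and TCP or IPC endpoints instead of `inproc`, but the publish/subscribe pattern is the same.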

## Demo

- [StackFlow continuous speech recognition](./projects/llm_framework/README.md)
- [StackFlow LLM wake-up dialogue](./projects/llm_framework/README.md)
- [StackFlow TTS speech synthesis and playback](./projects/llm_framework/README.md)
- [StackFlow YOLO visual detection](https://github.com/Abandon-ht/ModuleLLM_Development_Guide/tree/dev/ESP32/cpp)
- [StackFlow VLM image description](https://github.com/Abandon-ht/ModuleLLM_Development_Guide/tree/dev/ESP32/cpp)

## System Requirements
StackFlow's current AI units are built on the AXERA acceleration platform, with the AX630C and AX650N as the main chip platforms. The system requirement is Ubuntu.

## Compile
StackFlow mainly runs on embedded Linux devices, so compile on a Linux host. The cross-compilation toolchain is aarch64-none-linux-gnu.
```bash
# Install the x86_64-hosted AArch64 cross-compilation toolchain
wget https://m5stack.oss-cn-shenzhen.aliyuncs.com/resource/linux/llm/gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu.tar.gz
sudo tar zxvf gcc-arm-10.3-2021.07-x86_64-aarch64-none-linux-gnu.tar.gz -C /opt

# Install dependencies
sudo apt install python3 python3-pip libffi-dev
pip3 install parse scons requests kconfiglib

# Download the StackFlow source code
git clone https://github.com/m5stack/StackFlow.git
cd StackFlow
git submodule update --init
cd projects/llm_framework
scons distclean

# Compile. Note: the build downloads source code, binary libraries, and other
# files, so make sure you have a stable internet connection.
scons -j22

# Package the deb files. Note: because the LLM model files are large, packaging
# requires a lot of disk space (128GB or more is recommended), and a large
# number of binary files will be downloaded, so watch your data usage.
cd tools
python3 llm_pack.py
```

## Installation
StackFlow's program and model data are packaged separately. After installing the program packages, download the model packages separately and configure them into the program: install the program packages first, then the model packages.

Bare-metal installation (run the following commands on the LLM device):
```bash
# First, install the dynamic-library dependencies.
dpkg -i ./lib-llm_1.4-m5stack1_arm64.deb
# Then install the llm-sys main unit.
dpkg -i ./llm-sys_1.4-m5stack1_arm64.deb
# Install the other llm units.
dpkg -i ./llm-xxx_1.4-m5stack1_arm64.deb
# Install the model packages.
dpkg -i ./llm-xxx_1.4-m5stack1_arm64.deb
# Note: lib-llm_1.4-m5stack1_arm64.deb and llm-sys_1.4-m5stack1_arm64.deb must be
# installed in that order; the other llm units and model packages can be installed
# in any order.
```

## Upgrade
You can upgrade a single AI unit or the entire StackFlow framework.
A single unit can be upgraded via an SD card or installed manually with `dpkg`. Note that a minor-version package can be installed on its own, but a major-version upgrade requires reinstalling all llm units.
Upgrading a package from the command line:
```bash
# Install the llm units that need upgrading.
dpkg -i ./llm-xxx_1.4-m5stack1_arm64.deb
```
[Automatic upgrade installation on the device.](https://docs.m5stack.com/en/guide/llm/llm/image)
## Run
The AI services run automatically at startup and can also be started manually. Check the running status of the sys unit:
```bash
systemctl status llm-sys        # check whether the sys unit is running
sudo systemctl restart llm-sys  # restart it manually if needed
```
The standard systemd service commands (start, stop, restart, etc.) all apply.

## Configuration
StackFlow's configuration falls into two categories: unit operation parameters and model operation parameters.
Both kinds of configuration files use JSON and live in the following directories:
```
/opt/m5stack/data/models/
/opt/m5stack/share/
```
## Interface
StackFlow can be accessed through a UART port or a TCP port. The default UART baud rate is 115200 and the default TCP port is 10001; both can be changed in the configuration files.
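
As a rough illustration of talking to the TCP port, the sketch below sends one JSON request and reads one JSON reply. The payload fields (`request_id`, `work_id`, `action`) and the newline-delimited framing are assumptions for illustration, and a local stand-in server is used in place of a real device listening on port 10001.

```python
# JSON-over-TCP client sketch for StackFlow's TCP interface (default port 10001).
# A local stand-in "device" replaces real hardware; the payload fields and the
# newline-delimited framing are illustrative assumptions, not the real protocol.
import json
import socket
import threading

def fake_device(server: socket.socket) -> None:
    """Stand-in for the LLM device: answers one newline-delimited JSON request."""
    conn, _ = server.accept()
    with conn, conn.makefile("r") as reader:
        req = json.loads(reader.readline())
        reply = {"request_id": req["request_id"], "error": {"code": 0, "message": ""}}
        conn.sendall((json.dumps(reply) + "\n").encode())

server = socket.create_server(("127.0.0.1", 0))  # ephemeral port instead of 10001
threading.Thread(target=fake_device, args=(server,), daemon=True).start()

with socket.create_connection(server.getsockname()) as cli:
    request = {"request_id": "1", "work_id": "sys", "action": "ping"}
    cli.sendall((json.dumps(request) + "\n").encode())
    reply = json.loads(cli.makefile("r").readline())

print(reply["error"]["code"])  # a zero error code indicates success here
```

Against a real device, replace the stand-in with `socket.create_connection((device_ip, 10001))`, where `device_ip` is your module's address.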

## Contribution

* If you like this project, please give it a star first.
* To report a bug, please open an issue on the [issue page](https://github.com/m5stack/StackFlow/issues).
* To contribute code, feel free to fork the repository and submit a pull request.

## Star History

[](https://star-history.com/#m5stack/StackFlow&Date)