Call for Collaboration: Research on Neural Encoding and Decoding Mechanisms and High-Precision Brain-Controlled Systems Based on the DIKWP Model and Artificial Consciousness
World Academy for Artificial Consciousness (WAAC)
International Standardization Committee of Networked DIKWP for Artificial Intelligence Evaluation (DIKWP-SC)
World Artificial Consciousness CIC (WAC)
World Conference on Artificial Consciousness (WCAC)
Email: duanyucong@hotmail.com
Contents
1. Background and significance
2. Research objectives and overall technical roadmap
3. Key research content
3.1 Modeling of fine motor and language neural coding
3.2 Multimodal decoding model design
3.3 Development of high-precision brain control system
3.4 Research on brain-cognitive-model alignment mechanism
5. Phased results and assessment indicators
1. Background and significance
National strategic needs and technical background: The principle of cranial neural encoding and decoding is the core scientific problem in the field of brain-computer interfaces (BCI), and it is also the key technical bottleneck for realizing the integration of human and machine intelligence. How to "read" human purpose and cognitive information from complex and variable EEG/neural signals and use it to control external devices or interact with artificial intelligence systems is a hot topic at the international research frontier. In recent years, there have been breakthroughs in AI-based decoding technology: researchers have been able to decode the words and sentences heard by participants from brain activity under non-invasive conditions, with an accuracy of up to 73%. This marks an important step toward "non-invasive mind reading" that decodes human language and consciousness. However, existing methods are still limited to preliminary results in specific contexts and fall far short of the precision and robustness required for practical use. On the one hand, brain signals are highly variable and the signal-to-noise ratio is low, which makes it difficult to generalize across subjects and tasks. On the other hand, there is a lack of systematic models bridging high-level semantic purpose and low-level neural signals; most current decoding relies on "black box" machine learning that lacks a mechanistic understanding of brain cognitive processes, so it often fails to cope with new situations or to provide explanations. The National Key R&D Program explicitly lists "cranial neural codec principles and technologies" as a priority direction, requiring new methods for brain information decoding that are both theoretically explainable and cross-modally adaptable, and that break through the bottlenecks of existing brain-computer interfaces in multimodal fusion and real-time precise control.
The theoretical value of the DIKWP × DIKWP interaction model: In response to the above challenges, this project introduces the original DIKWP artificial consciousness model and its "dual DIKWP interaction" framework proposed by Prof. Yucong Duan's team, aiming to provide a new theoretical basis for multimodal brain information processing. DIKWP stands for the five-level Data-Information-Knowledge-Wisdom-Purpose cognitive model, which adds the highest-level "Purpose" dimension to the classical DIKW pyramid to emphasize the goal-driven character of agent behavior. Unlike traditional bottom-up linear cognitive architectures, the DIKWP framework is designed as a mesh topology with fully connected interactions and multiple feedback paths. Between the levels there is not only bottom-up information extraction and integration but also top-down regulation and guidance, forming a self-evolving cognitive closed loop. Specifically, low-level data and signals are processed and elevated into information, knowledge, and wisdom, while high-level wisdom and purpose in turn guide new data collection and information selection, realizing cross-level feedback adjustment. This multi-path interaction mechanism ensures that the system can continuously accumulate new knowledge in a dynamic environment, flexibly adjust its goals, and continuously optimize its behavioral strategy. The DIKWP model is therefore naturally suited to characterizing brain multimodal information processing: parallel multi-pathway, hierarchical multi-scale, and closed-loop regulation. For example, in language generation the brain does not convert semantics into sentences linearly, but involves the interaction of multiple layers of information such as vocabulary, syntax, and pragmatics. Through the cycle between the knowledge layer and the wisdom layer, the DIKWP model allows language generation to consider not only low-level lexical data but also high-level purpose and background knowledge, achieving output more consistent with the logic of human language, similar to the bidirectional integration of semantics and syntax by the language centers of the human brain. As another example, in motor control the commands of the motor cortex are modulated by the cerebellum and by sensory feedback to achieve fine coordination; the network structure of DIKWP can simulate this sensory-motor circuit interaction, so that the motor neural coding model has not only a feedforward command pathway but also a feedback correction mechanism, more accurately describing the principle of fine motor control.
It is worth mentioning that Professor Yucong Duan has further proposed the "DIKWP × DIKWP" interaction paradigm to describe intelligent interaction and fusion across semantic space and conceptual space. Put simply, two DIKWP frameworks are connected: one represents the external environment/semantic space (such as the information flow of external language and visual scenes), and the other represents the internal cognitive/conceptual space (the state of the agent's brain). Through the mapping interaction of the dual DIKWP structures, the human brain's mechanism of understanding and responding to external information can be simulated. For example, when the human brain understands a piece of text, there is a mapping from semantics to concepts (external language maps to internal knowledge) and a generation from concepts to semantics (forming answers based on internal understanding) – this is the bidirectional mapping between semantic space and conceptual space. At present, research on this cross-space interaction mechanism is still relatively weak. This project introduces the DIKWP × DIKWP paradigm into brain-computer interface modeling to bridge the gap between brain signal patterns and AI semantics: on the one hand, mapping human brain neural activity (data/information level) to the large-model semantic representation of artificial intelligence (knowledge/Wisdom level); on the other hand, based on the reasoning results and purpose of the AI model, generating feedback instructions that the brain can understand (data/information level). This two-way coupling mechanism is expected to address the semantic gap in current brain-computer interfaces and provide a new approach to decoding complex cognitive functions.
The necessity of introducing the theory of artificial consciousness: Under the DIKWP framework, the top-level "P" (Purpose) actually corresponds to content in the category of artificial consciousness. This project integrates the theory of artificial consciousness into the study of brain encoding and decoding and uses its interpretation of higher cognitive functions to guide neural signal analysis. Artificial consciousness focuses on how to realize cognitive processes similar to human consciousness in artificial systems, including elements such as a self-model, purpose-driven behavior, and metacognition. Introducing these theories into our brain control system has the following important implications:
Mapping and analysis of higher cognitive functions: The theory of artificial consciousness provides a new perspective for understanding the relationship between higher functions and neural activity, such as attention, purpose, decision-making, and states of consciousness. For example, models of consciousness such as Global Workspace Theory suggest that certain globally synchronized patterns of neural activity in the brain correspond to conscious content. Drawing on such ideas, this project introduces a consciousness/Wisdom layer into the DIKWP model to explain which neural signal patterns correspond to the user's purpose and attention, improving the decoding algorithm's ability to grasp the user's real purpose.
Improve the robustness and generalization of decoding: An AI decoder with human-like adaptive awareness can adjust its interpretation of the same signal according to changes in the environment and context, avoiding rigidity. Once the AI has the metacognitive capability to monitor the uncertainty of its own decisions and flag it, the system can proactively request more information or adopt a more conservative decoding strategy when brain signals are noisy or new patterns emerge, thereby reducing the error rate. This kind of self-supervision is an important feature of artificial consciousness and is expected to address the sharp drop in accuracy that current decoding algorithms suffer when facing new environments.
Enhance explainability and human-machine comprehensibility: The artificial consciousness framework emphasizes explicit and modular internal processing, with each layer having a clear function. For example, in DIKWP each layer of processing has clear semantics: the data layer extracts physiological signal features, the information layer identifies primary patterns, the knowledge layer performs rule-based reasoning, the Wisdom layer makes decisions, and the Purpose layer imposes purpose constraints. Such a design makes the AI inference process transparent, and intermediate steps can be extracted and reviewed, achieving "white box" interpretable AI. For BCIs, transparency in the decoding process means that doctors and users can understand why the system interprets brain signals the way it does, increasing trust in the system and safety of use. This is especially critical in medical scenarios, to avoid the uncertain risks brought by "black box AI" in life-related decision-making. According to reports, the DIKWP model has greatly improved the interpretability and transparency of AI systems thanks to its modular, evidence-traceable inference process, ensuring that every step of decision-making can be traced in fields such as medical diagnosis. Similarly, we expect the brain control system to be structurally understandable to humans, that is, the human brain and the AI share a common semantic baseline, ensuring that decoding results always remain within a range that humans can comprehend and control.
Provide theoretical support for new brain-computer interfaces: The combination of artificial consciousness and DIKWP is expected to establish a new paradigm of purpose-driven brain-computer interfaces. Traditional brain-computer interfaces are mostly static mappings from signals to instructions, but guided by the artificial consciousness framework, the system can have an initiative similar to "consciousness". For example, the system can activate relevant decoding pathways in advance based on inference of the user's goals (e.g., focusing on analyzing signals from movement-related brain regions when the user wants to move an arm); when the user's attention lapses, the system can detect this and pause operations. Such human-computer interaction greatly improves the intelligence level of the system and makes brain control smoother and more natural.
In summary, this project responds to the national call for research on "cranial neural encoding and decoding principles" and opens a new path for multimodal brain information processing and decoding, supported by the DIKWP × DIKWP model and artificial consciousness theory. It has great academic innovation significance: on the one hand, it fills the theoretical gap in cross-level, cross-modal brain cognitive modeling and develops a new theory of interpretable brain decoding; on the other hand, it lays the foundation for engineering applications, pushes the next generation of high-precision brain control systems from concept to reality, and provides transformative technologies for the rehabilitation of disabled people, intelligent neuroprostheses, human-machine collaboration, and other fields. Cutting-edge brain-computer experiments such as BrainGate have shown that invasive electrodes can allow paralyzed patients to grasp objects or operate an exoskeleton to walk by thought alone, but problems such as difficulty in signal interpretation and expensive, cumbersome equipment remain. Our research will strive to break through these bottlenecks with more intelligent algorithms and more systematic models, seizing the opportunity for China in the competition for the strategic high ground of brain-computer interfaces.
2. Research objectives and overall technical roadmap
Overall goal: This project aims to build a closed-loop brain control system spanning cognitive purpose, neural coding, AI decoding, and peripheral driving with feedback, and to form a complete theoretical and technical methodology for encoding and decoding high-level human brain functions. Specifically, we will reveal the neural coding mechanisms of cognitive purposes such as language and movement in the brain, develop multimodal, high-precision AI decoding algorithms for brain signals, integrate and develop a prototype brain control system supporting multiple external devices, and explore alignment methods between human cognitive processes and artificial intelligence models. Through three years of research, we strive, theoretically, to enrich the cognitive computing model of cranial neural encoding and decoding; technically, to improve brain-computer interface performance by an order of magnitude; and in application, to promote the implementation of brain control technology in rehabilitation medicine and human-computer interaction.
Overall technical route: Focusing on the above goals, we will follow the technical route of "basic mechanism research → model and algorithm development → system integration and verification", breaking through key scientific problems and engineering difficulties in turn. Figure 1 summarizes the architecture and technical route of the closed-loop brain control system to be constructed in this project; in terms of information flow, each module corresponds to the cycle "cognitive purpose → neural signal encoding → signal acquisition → AI decoding → device control and feedback".
Cognitive Purpose: This is the starting point of a closed loop and represents the user's mental purpose or behavioral goal in a specific situation. For example, reaching for a cup, saying a word, or imagining a picture in your head. Based on the artificial consciousness model, we will define a "cognitive generation layer" that describes the user's high-level purpose, attention, and state of consciousness. This is equivalent to the Purpose and Wisdom parts of the DIKWP model, which contain the user's current goals, task understanding, and knowledge and strategies to be adapted in the task context. The cognitive generative layer is not directly derived from brain signals, but is modeled by designing experimental paradigms and utilizing artificial consciousness simulations. For example, the paradigm of subjects performing specific cognitive tasks (such as imagining movements, conceiving sentences) can be introduced, combined with questionnaires or behavioral reports, to determine the purpose content of subjects in different trials, and provide labeling basis for subsequent neural coding modeling.
Neural encoding (brain signaling layer): Cognitive purpose is encoded in the brain as a series of neural activity patterns, including firing sequences, local field potential oscillations, brain waves, etc. This project will focus on the neural coding mechanisms of language and movement, two typical functions, and extend consideration to patterns such as visual imagery. Through multi-brain-region and multimodal data collection, we extract neural coding features, such as the frequency-band power changes of the motor cortex corresponding to a motor purpose, and the specific activation sequences of the temporal and frontal lobes corresponding to a language purpose. Here, the data layer (D) and information layer (I) concepts of the DIKWP framework are used to extract and fuse the raw neural data at multiple levels, so that components related to high-level semantics are retained as much as possible and irrelevant noise is eliminated. We will construct a neural coding model with the DIKWP interaction structure, that is, using the multipath network of the DIKWP model, the neural signals of different brain regions and different modalities are treated as data streams on the network nodes and are interactively processed and integrated at the data-information-knowledge levels. For example, for language purpose encoding, the model will fuse data from the auditory cortex (the echo of one's own inner speech) with information from the language centers (word-selection signals), and introduce the role of contextual memory through knowledge-layer nodes, so as to form an overall neural representation of the purpose of a sentence. The output of this module is a "neural coding representation" that can be read by the AI model, corresponding to the mapping of cognitive purpose in the brain signal space.
Electrical Signal Acquisition and Transmission: In order to validate and apply the above coding models, we need high-performance brain signal acquisition devices and interfaces. This project plans to build a high-channel-count, multi-modal fusion brain signal acquisition platform, including non-invasive high-density EEG/MEG, functional near-infrared (fNIRS), and newly developed ultrasound brain imaging technology. Among them, ultrasound neuroimaging has the characteristics of high spatial resolution and high temporal resolution, which is expected to make up for the shortcomings of EEG/MEG and fMRI. We will explore the method of EEG+ functional ultrasound fusion to achieve a more comprehensive observation of neural activity by temporally synchronized and spatially registered data of different modalities. At the same time, we will use wireless transmission and embedded signal pre-processing technology to ensure that large-capacity data can be transmitted to the decoding algorithm module in real time and reliably. This part is equivalent to building the sensing and data layer infrastructure in the DIKWP model, providing a high-quality "data source" for the closed-loop system. The output of this link is: a stream of multi-channel neural signals that have undergone preprocessing and preliminary feature extraction, which is fed to the AI decoding module in real time.
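As a concrete illustration of the embedded preprocessing performed at this data-layer stage, the sketch below shows a minimal EEG preprocessing chain (band-pass and notch filtering followed by epoching). The channel count, sampling rate, and filter bands are illustrative assumptions, not the project's final acquisition parameters.

```python
import numpy as np
from scipy.signal import butter, iirnotch, filtfilt

FS = 1000          # assumed sampling rate (Hz)
N_CHANNELS = 64    # assumed high-density EEG channel count

def preprocess(raw: np.ndarray) -> np.ndarray:
    """Band-pass 1-45 Hz and notch 50 Hz; raw has shape (channels, samples)."""
    b, a = butter(4, [1, 45], btype="bandpass", fs=FS)
    x = filtfilt(b, a, raw, axis=1)
    bn, an = iirnotch(50.0, Q=30.0, fs=FS)
    return filtfilt(bn, an, x, axis=1)

def epoch(signal: np.ndarray, onsets_s, win_s=1.0):
    """Cut fixed-length windows (trials) aligned to task onsets."""
    win = int(win_s * FS)
    return np.stack([signal[:, int(t * FS): int(t * FS) + win] for t in onsets_s])

# Example: a simulated raw stream and two trial onsets
raw = np.random.randn(N_CHANNELS, 10 * FS)
trials = epoch(preprocess(raw), onsets_s=[2.0, 5.5])
print(trials.shape)  # (2, 64, 1000)
```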
AI decoding (artificial intelligence cognitive layer): This is the brain "readout" link of the closed-loop system, and its core is to develop new multimodal brain signal decoding algorithms. The AI decoding module of this project will integrate the knowledge layer (K) and Wisdom layer (W) concepts of the DIKWP model, as well as the purpose understanding mechanism of the artificial consciousness system. Specifically, we plan to adopt a hierarchical decoding architecture: first, a deep learning model performs low-level pattern recognition on the neural signals (corresponding to the conversion from the information layer to the knowledge layer), extracting primary decoding results such as motion direction and speech semantics. Subsequently, the knowledge-reasoning ability of large models (such as a large language model, LLM, or a large vision model) is introduced, and semantic fusion and high-level inference are performed on the primary results (corresponding to the decision-making process of the Wisdom layer). The role of artificial consciousness here is reflected in two aspects. First, we set up a Purpose inference submodule, corresponding to the simulated "Purpose layer (P)" of artificial consciousness, which combines context (such as the current task and environmental state) with monitoring of the user's conscious state, infers what user purpose the signal at this moment may correspond to, and generates prior assumptions about the operational objectives. Second, metacognitive control is embedded in the decoding process: the algorithm evaluates its confidence in the decoding result, and when confidence is low it can request data over a longer window or start an alternate decoding path (such as switching decoding model parameters), thereby improving overall reliability. The output of this module is a machine-readable understanding of the user's purpose, such as recognizing that the user wants to perform the action of "grabbing a cup" at this moment, or recognizing that the user's brain is internally reciting the phrase "a glass of water". It is worth emphasizing that our multimodal decoding model design pursues low latency and high accuracy: by optimizing the model structure and inference algorithm, the delay from signal acquisition to decoded output is controlled within hundreds of milliseconds, meeting the requirements of real-time closed-loop control.
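To make the metacognitive control concrete, the following sketch shows one way a hierarchical decoder could gate its output on confidence and fall back to a longer signal window. The threshold, retry count, and model stubs are hypothetical illustrations rather than the project's finished decoder.

```python
import numpy as np

CONF_THRESHOLD = 0.7   # assumed confidence gate (hypothetical value)
MAX_RETRIES = 2        # how many times to request a longer window before abstaining

class HierarchicalDecoder:
    """Perception -> cognition -> purpose pipeline with a metacognitive confidence gate."""

    def __init__(self, perception_model, reasoner, purpose_prior):
        self.perception = perception_model   # low-level pattern recognizer (I -> K)
        self.reasoner = reasoner             # knowledge-level re-ranker (K -> W)
        self.purpose_prior = purpose_prior   # scores candidates against the inferred goal (W -> P)

    def decode(self, window: np.ndarray, request_more):
        for _ in range(MAX_RETRIES + 1):
            probs = self.perception(window)          # class probabilities for primary patterns
            if float(np.max(probs)) >= CONF_THRESHOLD:
                candidates = self.reasoner(probs)    # contextual candidates (e.g. phrases, motions)
                scores = [self.purpose_prior(c) for c in candidates]
                return candidates[int(np.argmax(scores))]
            window = request_more()                  # metacognition: ask for more signal
        return None                                  # abstain rather than emit a low-confidence command
```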
Peripheral Control & Feedback: This is the execution link that closes the loop. According to the user purpose decoded by the AI, this module converts the purpose into specific control commands sent to the target external device, and processes the device's feedback information to pass it back to the user, forming a human-machine integrated control loop. For example, for a peripheral such as a robotic arm, when the "grasping" purpose is decoded, the control module needs to generate the motion instruction sequence for each joint of the robotic arm to complete the grasping action. During execution, the robotic arm's sensors collect feedback such as grip force and object contact, which is communicated to the user through visual/tactile devices (such as displaying the status of the robotic arm through VR glasses or providing tactile feedback to the user's skin through stimulation devices), and is also fed back to the AI decoding module to correct subsequent decoding (for example, a detected grasping failure suggests the decoding module may have made an error). In this way, the whole system forms a closed loop: the user's purpose shapes the brain signal, the brain signal is decoded by the AI to control the device, and the device's feedback influences the user's cognition, entering the next cycle. We will use the knowledge update and purpose adjustment mechanisms in the DIKWP model to optimize closed-loop control performance: after each completed round, the system updates its internal knowledge base (e.g., success/failure experience) and adjusts its future action strategy accordingly (Wisdom layer), and the user's goal layer may also change (e.g., switching goals when a task is completed). Through repeated cycles of training, humans and machines are expected to achieve synergy: the user learns to generate brain signals that the machine can interpret more stably, and the machine gradually learns the user's characteristics, achieving proficient cooperation.
In summary, the overall technical route is based on the DIKWP interaction structure, which integrates the cognitive purpose layer, neural coding layer, decoding decision layer, and device execution layer into a closed-loop system. The DIKWP model runs throughout: from the initial brain data collection to the final Wisdom-layer decision and feedback, every step is semantically situated and interacts within the DIKWP framework. This design allows us to define the research tasks clearly: first, at the basic level, clarify "how cognitive purpose is encoded as brain signals" (research content 3.1); then, at the model level, "how to efficiently decode brain signals across multiple modalities" (research content 3.2); next, at the system level, "how to drive complex peripherals with decoding results" (research content 3.3); and finally, at the level of human-computer integration, "how to align the semantics of human brain cognition and the AI model" (research content 3.4). This route ensures the logical connection between the research contents and lays a phased technical foundation for the final system integration.
3. Key research content
Focusing on the above technical routes and goals, we have identified the following four directions as the key research contents of this project:
3.1 Modeling of fine motor and language neural coding
Objectives: To elucidate the neural coding mechanism of human fine motor (such as fingers, wrist joints, etc.) and language (internal language planning and speech) functions in the brain, and to construct a unified neural coding representation structure based on the DIKWP × DIKWP interaction model. We will answer: How does the brain encode a specific cognitive purpose (wanting to accomplish an action or express a sentence) into a pattern of neural activity? What is the relationship between the data layer signals, information layer features and higher-level cognitive semantics in these patterns? Is there a coding pattern that is consistent across individuals and tasks that can be generalized?
Research content and program:
Multimodal Brain Signal Acquisition and Experimental Paradigm Design: We will design different task paradigms to obtain brain data for the two major functions of movement and language. For example, in the motor coding study, subjects will be recruited to perform specific hand movements (such as index finger and little finger flexion and extension, palm grasp and release, etc.), and high-density EEG, MEG, and, if necessary, invasive cortical electrode data (e.g., through cooperation opportunities with clinical epilepsy patients) will be recorded to obtain the corresponding motor cortex activation patterns. In language coding studies, participants will be asked to read or listen to sentences and repeat them in their heads, or to imagine saying specific sentences, while the corresponding activity in the brain's language areas is recorded. To complement the visual modality, we also consider visual imagery tasks in simple scenes (such as imagining the shape of a specific object) to obtain signals from visual areas such as the occipital lobes. Through multimodal acquisition, we ensure that the data dimensions cover information with complementary spatiotemporal resolutions: EEG/MEG provides millisecond-level temporal detail, and fNIRS/functional ultrasound provides millimeter-level spatial distribution, enabling a comprehensive view of neural coding phenomena.
Neural coding analysis under the DIKWP framework: In terms of data processing, we introduce the DIKWP model to perform hierarchical decomposition and interactive analysis of the neural coding process. The specific method is as follows: first, the raw signal is treated as input to the data layer (D), and signal processing and machine learning methods are used to extract preliminary features (such as frequency-band power spectra, neuronal firing rates, etc.), which constitute the information layer (I) representation; then, the association between these features and the cognitive variables designed in the experimental paradigm is explored, in an attempt to further elevate the features to the knowledge layer (K), that is, to find structured patterns. For example, we analyze whether different finger movements in motor tasks correspond to different spatial patterns in the EEG spectrum, such as differences in the extent of local μ-rhythm suppression, and whether words of different semantic classes (nouns, verbs, etc.) elicit distinguishable patterns in brain signals during language tasks. This step amounts to refining "neural semantics" at the knowledge level: finding the correspondence between decodable neural features and specific cognitive meanings. We will use a combination of statistical analysis (e.g., difference tests on EEG power topography maps), encoding models, and decoding models: on the one hand, we will build prediction models from stimulus/task to brain signals to test hypotheses, and on the other hand, we will train decoders to predict cognitive state from brain signals to test the discriminative power of these features.
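As a minimal illustration of the data-layer to information-layer step described above, the sketch below computes per-channel band-power features (e.g., μ and β bands) from an epoched trial using Welch's method. The band definitions, sampling rate, and shapes are assumptions for illustration.

```python
import numpy as np
from scipy.signal import welch

FS = 1000                                    # assumed sampling rate (Hz)
BANDS = {"mu": (8, 13), "beta": (13, 30)}    # illustrative frequency bands

def band_power_features(trial: np.ndarray) -> np.ndarray:
    """Map a (channels, samples) trial (data layer) to band-power features (information layer)."""
    freqs, psd = welch(trial, fs=FS, nperseg=256, axis=1)   # psd: (channels, n_freqs)
    feats = []
    for lo, hi in BANDS.values():
        mask = (freqs >= lo) & (freqs < hi)
        feats.append(psd[:, mask].mean(axis=1))             # mean power per channel per band
    return np.concatenate(feats)                            # shape: (channels * n_bands,)

# Example: one simulated 1-second, 64-channel trial
features = band_power_features(np.random.randn(64, FS))
print(features.shape)  # (128,)
```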
Multipath interaction coding modeling: The advantage of DIKWP lies in multi-level, multipath interaction. Accordingly, we hypothesize that the brain's coding of fine motor and language functions is not mutually isolated but intersects and merges: for example, language has a motor component (vocalization involves tongue and throat movements), and movement is also regulated by language (through verbal instructions). We will build a DIKWP × DIKWP interaction coding model that treats movement and language as two DIKWP chains interacting at certain levels. For example, at the Wisdom layer (W): when a participant performs a language-related action (such as articulation or imagined speech), the language content (semantics) and the motor schema are activated synchronously in the brain and influence each other, so the knowledge/Wisdom nodes of the DIKWP model should integrate both kinds of information. This model will realize cross-functional neural coding representations by jointly analyzing data from motor tasks and language tasks. Specifically, we plan to train a multi-task deep neural network in which some parameters are shared for learning generic brain signal features (low-level data/information representations), while the remaining parameters serve the motor and language task outputs respectively (high-level semantic decoding). Through this shared-specific hybrid structure, we hope to capture the commonalities and specificities of motor and language coding: the commonalities may correspond to general modulation such as attention and working memory (expected to be shared at the Wisdom layer), and the specific parts correspond to the particular coding patterns of the motor and language cortices. During training, we will introduce a Purpose-layer prior, i.e., instructing the model whether the current decoding target is primarily motor or language according to the experimental task, so as to simulate the selective opening of underlying pathways by the DIKWP Purpose layer. Model performance will be evaluated by the improvement in multi-task decoding accuracy and the ability to transfer between tasks: for example, whether model parameters trained on a language task can accelerate learning of a motor task, and vice versa. This will test our hypothesis about motor-language encoding interaction.
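The shared-specific multi-task structure described above could be prototyped as in the following sketch. PyTorch is assumed as the framework; the layer sizes, task heads, and gating by a task cue are illustrative choices rather than the final design.

```python
import torch
import torch.nn as nn

class SharedSpecificNet(nn.Module):
    """Shared trunk for generic brain-signal features, task-specific heads for motor vs. language."""

    def __init__(self, n_features=128, hidden=256, n_motor_classes=5, n_language_classes=50):
        super().__init__()
        self.shared = nn.Sequential(              # common data/information-level representation
            nn.Linear(n_features, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.motor_head = nn.Linear(hidden, n_motor_classes)        # motor decoding head
        self.language_head = nn.Linear(hidden, n_language_classes)  # language decoding head

    def forward(self, x, task: str):
        h = self.shared(x)
        # Purpose-layer prior: the task cue selectively opens one pathway
        return self.motor_head(h) if task == "motor" else self.language_head(h)

model = SharedSpecificNet()
logits = model(torch.randn(8, 128), task="motor")   # batch of 8 feature vectors
print(logits.shape)  # torch.Size([8, 5])
```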
Generalizable neural coding representation: With the above model, we will refine a unified neural coding representation structure with the potential for generalization across subjects and modalities. This may take the form of a template spatial feature or a domain-invariant representation. For example, in motor coding we extract positional features relative to cortical functional areas so that differences between subjects are reduced; in language coding we extract semantically relevant network connectivity features rather than individual electrode waveforms. Ultimately, we hope to obtain a high-dimensional representation vector or tensor whose dimensions correspond to interpretable neural modules (such as "activation intensity of purpose X" or "activation intensity of language semantics Y"). This representation is the input to the project's subsequent decoding algorithms and the bridge for aligning with AI models.
Expected Results: Through this study, we will obtain new theoretical findings and data support for cranial neural coding. In terms of movement, it may reveal, for example, the fine cortical topographic distribution of single-finger movements and the joint coding patterns of multi-finger combined movements; in terms of language, it may reveal the brain's different coding pathways for nouns, verbs, and other word categories, as well as the differences and connections between inner speech (imagined language) and external auditory language. If the research goes well, we will publish 1-2 papers on the brain coding mechanisms of movement and language and form a set of general neural coding models that provide a direct basis for the development of decoding algorithms. Importantly, we will accumulate a large multimodal brain signal dataset and characterize the implicit "data-information-knowledge" transformation path, verifying the effectiveness of the DIKWP model in explaining the cognitive processes of the biological brain.
3.2 Multimodal decoding model design
Research Objectives: To develop a multimodal AI decoding model of brain signals combined with artificial consciousness Purpose reasoning, to achieve low-latency and high-accuracy decoding of multiple types of brain activities such as movement, language, and vision. The model should be able to map the neural encoded representations extracted in Section 3.1 back to the corresponding semantic commands or contents, such as restoring brain signals to the sequence of hand movements imagined by the user, or decoding sentences that the user hears/thinks. We pursue a unified decoding framework that is compatible with multiple input signal modes and interprets multiple forms of Purpose outputs. At the same time, the human-aware Purpose layer and Wisdom layer logic are introduced to make the decoding process context-aware and adaptively adjustable, so as to improve the decoding accuracy and reduce latency and false positives.
Research content and program:
The overall architecture of the multimodal decoding model: We design a hierarchical, progressive decoding architecture that mirrors the middle and upper levels of the DIKWP model. First, a perception decoding layer decodes the low-level features of the brain signal into preliminary recognition results (corresponding to the information layer → knowledge layer conversion of DIKWP); then a cognitive decoding layer combines context and knowledge to perform semantic interpretation and reasoning on the preliminary results (corresponding to the knowledge layer → Wisdom layer transformation); finally, a Purpose decision-making layer produces the final output based on the inference results and the estimated user purpose (corresponding to the Wisdom layer → Purpose layer). For example, for language decoding, the perception layer may first output a string of candidate words or phoneme probabilities, the cognitive layer may use language-model knowledge to assemble possible sentences from them, and the Purpose decision-making layer selects the sentence that best matches the user's purpose as the output, based on the context of the current topic.
Semantic decoding with fused large models (LLM/CV): At the cognitive decoding layer, we will leverage the powerful prior knowledge and semantic representation capabilities of today's large pre-trained models. Specifically, for language decoding, we introduce pre-trained large language models (such as the GPT series) that take the representation vectors extracted from brain signals as prompts or conditional inputs, so that the language model produces the corresponding text. Recent work has shown that vectors decoded from fMRI can be fed directly into a pre-trained language model for autoregressive text generation, producing sentences closer to the original semantics without being limited to a candidate set. We will draw on this idea: use the neural coding representation obtained in Section 3.1 as a conditional embedding, connect it to a Chinese large language model (since the application population mainly speaks Chinese, we will give priority to selecting or training a Chinese language model), and let it generate text output. This method is expected to break through the limitations of traditional classification/matching methods and realize free generation over an open vocabulary. Similarly, for motion decoding, we consider introducing large models for high-level planning, such as using reinforcement-learning pre-trained models to score or correct the preliminary decoded motion sequences, ensuring that the output action sequences are physically coherent and semantically consistent with the purpose. For visual purpose decoding (e.g., decoding images imagined by the user), generative vision models (e.g., diffusion models, generative adversarial networks) can be used to convert brain signal representations into images; this is not the focus of the project, but we will follow the latest developments to assess feasibility.
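One simple way to realize the "brain representation as conditional input" idea is soft-prefix conditioning, sketched below with Hugging Face Transformers. The model name (GPT-2 as a stand-in for a Chinese LLM), the projection dimensions, and the prefix length are assumptions for illustration only; the projection layer would need to be trained on paired brain/text data.

```python
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"   # placeholder; a Chinese LLM would be substituted in practice

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
lm = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Trainable projection from the neural coding representation (Sec. 3.1) into the LM embedding space
brain_dim, prefix_len = 128, 4
proj = nn.Linear(brain_dim, prefix_len * lm.config.n_embd)

def generate_from_brain(brain_vec: torch.Tensor, max_new_tokens=20) -> str:
    """Use a brain-derived embedding as a soft prefix and let the LM continue from it."""
    prefix = proj(brain_vec).view(1, prefix_len, lm.config.n_embd)   # (1, prefix_len, d_model)
    out = lm.generate(inputs_embeds=prefix, max_new_tokens=max_new_tokens,
                      do_sample=False, pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(out[0], skip_special_tokens=True)

print(generate_from_brain(torch.randn(128)))   # untrained projection, so output is placeholder text
```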
Purpose-layer integration and artificial consciousness unit: In order to integrate the advantages of artificial consciousness into decoding, we design a dedicated Purpose inference module. The module sits at the top of the decoding architecture (corresponding to the DIKWP Purpose layer P); its input is the set of possible outputs and confidence levels provided by the cognitive decoding layer, and its output is a high-level inference of the user's current overall purpose together with the selection and regulation of the output results. This can be realized with a sequential decision-making model (such as a Transformer-based policy network) that takes as input the context of the current moment (such as previously recognized content or the previous motion control state), the internal state of the AI model (such as the hidden state of the language model), and the aforementioned output candidates, and generates a "purpose state vector" through inference. This vector can be understood as the AI's estimate of "what the user really wants to achieve". For example, during a continuous interaction, the purpose state vector might encode the message "the user is trying to complete the water-cup task". With this purpose vector, we evaluate how well each current candidate output matches it and select the highest-scoring candidate as the final result the system executes or presents. If no candidate matches well, a re-evaluation of the AI decoding layer may be triggered (metacognition prompting it: "what was decoded just now may be wrong, try again"). This Purpose inference module effectively equips the AI decoder with a "thinking brain", so that it is not limited to guessing results signal by signal but has a global task awareness to guide decoding. This also mimics the way executive control areas such as the frontal lobes in the human brain regulate the interpretation of sensory information according to the current goal.
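A minimal sketch of the candidate-selection step of this Purpose inference module follows. It assumes candidates and the purpose state are embedded into a shared vector space; the embedding values and the re-decode threshold are hypothetical.

```python
import numpy as np

RETRY_THRESHOLD = 0.4   # assumed minimum purpose-match score before a re-decode is triggered

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def select_output(candidates, cand_vectors, purpose_state):
    """Pick the decoding candidate most consistent with the inferred purpose-state vector."""
    scores = [cosine(v, purpose_state) for v in cand_vectors]
    best = int(np.argmax(scores))
    if scores[best] < RETRY_THRESHOLD:
        return None, scores[best]        # metacognitive signal: ask the decoding layer to try again
    return candidates[best], scores[best]

# Toy example with 3-D embeddings
purpose = np.array([1.0, 0.0, 0.2])      # e.g. "complete the water-cup task"
cands = ["grab cup", "wave hand"]
vecs = [np.array([0.9, 0.1, 0.1]), np.array([0.0, 1.0, 0.0])]
print(select_output(cands, vecs, purpose))   # -> ('grab cup', ~0.99)
```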
Low latency optimization and real-time assurance: We will focus on reducing latency in model design and implementation. On the one hand, the hierarchical architecture can be used for parallel flow processing: the perception layer outputs part of the results in real time for subsequent use, without waiting for the end of the complete sentence; On the other hand, the use of lightweight large models (or distilled models) combined with dedicated hardware acceleration (such as GPU/FPGA) to optimize the time-consuming links such as language model inference to the order of tens of milliseconds. In addition, we will investigate streaming decoding techniques, in which the model can output incremental results at any time in the case of continuous brain signal input (e.g., words are output while listening to brain signals, as is the case with speech input methods). We plan to achieve this through a sequence-to-sequence model with closed-loop feedback training, so that the model learns to update the previous output when it gets a new brain signal instead of overturning it entirely, so as to maintain coherence and low latency while ensuring accuracy.
Model Adaptation and Continuous Learning: Considering the individual differences and time-varying characteristics of brain signals, we will give the decoding model a degree of online learning and adaptive capability. On the one hand, a transfer learning strategy is introduced: the model is pre-trained on a large amount of general data and then rapidly fine-tuned with a small amount of data from an individual user to adapt it to that user's signal characteristics (e.g., using the user's calibration data to fine-tune the decoder). On the other hand, closed-loop self-calibration is implemented: model parameters are updated using user corrections obtained in the closed-loop feedback (e.g., the user reports decoding errors through other means). We can even monitor biofeedback such as error-related potentials (ErrP) in brain signals: when the AI output does not meet the user's expectation, a characteristic ErrP waveform often appears in the user's brain; the system detects it, flags the last decoding as an error, and adjusts accordingly. This allows the decoder to remain accurate over long-term use with less frequent manual calibration.
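The closed-loop self-calibration idea can be sketched as follows: a simple ErrP detector flags trials where the user's brain response suggests the last output was wrong, and those trials are replayed as corrective training examples. The detector, its threshold, and the sklearn-style partial_fit update are illustrative assumptions, not the project's actual adaptation scheme.

```python
import numpy as np

ERRP_THRESHOLD = 0.6   # assumed probability threshold for the ErrP classifier

def closed_loop_update(decoder, errp_detector, post_feedback_eeg,
                       last_features, last_output, n_classes):
    """If an error-related potential is detected after feedback, treat the last decode as wrong
    and push a corrective update to the decoder (down-weighting the emitted class)."""
    p_error = errp_detector(post_feedback_eeg)          # probability that the user perceived an error
    if p_error > ERRP_THRESHOLD:
        # Illustrative correction: exclude the emitted class and use the runner-up as a provisional label
        probs = decoder.predict_proba(last_features.reshape(1, -1))[0]
        probs[last_output] = 0.0
        provisional = int(np.argmax(probs))
        decoder.partial_fit(last_features.reshape(1, -1), [provisional],
                            classes=np.arange(n_classes))
    return decoder
```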
Expected Results: This research content will deliver a multimodal brain signal decoding software module. We plan to achieve the following indicators in experimental verification: for a single type of task (e.g., hand motor imagery classification at rest), decoding accuracy improved by >20% compared with existing methods; for comprehensive scenarios (such as tasks that include both motor and language purposes), the decoding model can automatically distinguish and identify different categories of purpose, with an accuracy above 80% and latency below 200 ms, significantly better than traditional models without the consciousness mechanisms. In language decoding tests, we expect to achieve decoding of consecutive sentences from non-invasive signals, such as recovering the gist of the brain signals recorded while participants listen to a story, with content matching above 50% in a limited short-story scenario. It is worth mentioning that the recently reported MindLLM model has successfully decoded fMRI signals into text; we will benchmark our results against such work to demonstrate the benefits of introducing DIKWP and artificial consciousness in this project. In addition, we will produce more than two academic papers on multimodal decoding algorithms and purpose fusion methods, to be published in top conferences/journals in neural engineering or artificial intelligence.
3.3 Development of high-precision brain control system
Research Objective: Based on the above coding and decoding results, a high-precision real-time brain control system will be developed through integration. The system should support at least 5 different types of external devices, including but not limited to robotic arms, exoskeleton robots, smart wheelchairs, prosthetic hands, and drones, so that users can manipulate high-degree-of-freedom devices with near-natural finesse through brain signals. We will focus on solving engineering challenges such as stability, real-time performance, and safety in system integration, and integrate the state evaluation and adaptive module of the DIKWP model to ensure that the system can evaluate and adjust the user's and its own state in real time during continuous operation, maintaining high-precision control. In addition, we will explore the possibility of parallel control of multiple peripherals to lay the foundation for multi-task brain control.
Research content and program:
The overall architecture of the brain control system: In terms of software, we will develop the brain control center platform software, which integrates the AI decoding module of Section 3.2 and the multi-device interface modules. The platform follows a modular design: it includes a signal processing module, a decoding decision module, a device mapping module, and a monitoring and evaluation module, which communicate with each other through a publish/subscribe mechanism to ensure scalability and reliability. In terms of hardware, we will set up a brain-controlled workstation equipped with a high-speed data acquisition card, a GPU computing unit, and various wireless communication units, so that it can be connected simultaneously to EEG caps, exoskeleton drive motors, robotic arm controllers, and other peripherals. In terms of network topology, a combined centralized-distributed scheme is adopted: brain signal decoding and high-level decision-making are centralized at the workstation, while low-level device control is distributed to embedded controllers connected through low-latency bus/wireless links.
Support for control interfaces of multiple types of peripherals: We will design corresponding instruction sets and mapping strategies for the control requirements of different types of peripherals. For example, for the robotic arm (a high-degree-of-freedom mechanical actuator, generally 6-7 joints), the brain control instructions we define include end-effector position increments or specific grasping actions, which are then converted into joint angle commands by an inverse kinematics module; for exoskeleton lower-limb robots (used as walking aids), brain-controlled commands can be defined as simple high-level commands such as "forward", "stop", and "turn", with the robot's own gait planning executing the details; for prosthetic hands, instructions may involve flexion and extension of each finger or switching among several preset gestures; for smart wheelchairs, the instructions include direction and speed adjustment; for drones, the commands include basic flight controls such as up/down/forward/stop. We will ensure that at least 5 peripheral protocols are connected, each with its own brain control interface library. Because different devices have different response characteristics and safety requirements, we will set up a safety check and arbitration mechanism for each type of device: for example, the robotic arm and exoskeleton must prevent injury from overly fast movements, so buffering is applied when the brain emits abrupt abnormal signals; drone control must consider communication packet loss, so mechanisms such as automatic hovering on packet loss are set up.
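The per-device control interface libraries could share a common abstraction like the sketch below, which maps a decoded purpose to device-specific commands behind a safety check. The device class, command fields, and motion limits are hypothetical examples rather than the project's finalized interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class DecodedPurpose:
    action: str             # e.g. "grasp", "forward", "stop"
    magnitude: float = 1.0  # normalized intensity in [0, 1]

class DeviceInterface(ABC):
    """Common brain-control interface: every peripheral implements the same two hooks."""

    @abstractmethod
    def to_command(self, purpose: DecodedPurpose): ...

    @abstractmethod
    def safety_check(self, command) -> bool: ...

    def execute(self, purpose: DecodedPurpose):
        cmd = self.to_command(purpose)
        return cmd if self.safety_check(cmd) else None   # arbitration: block unsafe commands

class RoboticArm(DeviceInterface):
    MAX_STEP_CM = 5.0                                     # assumed per-cycle motion limit

    def to_command(self, purpose):
        step = purpose.magnitude * self.MAX_STEP_CM
        return {"end_effector_step_cm": step, "gripper_close": purpose.action == "grasp"}

    def safety_check(self, command):
        return command["end_effector_step_cm"] <= self.MAX_STEP_CM

print(RoboticArm().execute(DecodedPurpose("grasp", 0.6)))
```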
DIKWP state evaluation and adaptive module: In order to ensure high accuracy and safety, we introduce the real-time state evaluation module of the DIKWP model as the "brain steward" of the system. The module continuously monitors three aspects of status:
1) User status: including the user's brain signal quality, degree of concentration, degree of fatigue, etc. Fatigue can be estimated by analyzing theta-band power changes in the EEG, and attention shifts can be judged by analyzing electrooculographic (eye-movement) artifacts. When a poor user state (such as inattention) is detected, high-risk actions can be postponed and the user can be prompted to rest.
2) Decoder status: Monitor the performance indicators of the AI decoder module, such as the recent output entropy value (uncertainty), error rate evaluation (which can be judged by comparing the feedback from the device sensor with the expected result), etc. If the decoding confidence level is found to be declining or there are continuous errors, the module can decide to trigger automatic calibration: for example, by calling the system's built-in calibration sequence (allowing the user to imagine several known instructions) to recalibrate the decoding model, or to dynamically adjust the decoding parameters;
3) Equipment status: monitoring the working state of peripherals (temperature, current, execution accuracy, etc.) and their interaction with the environment (such as whether the grip force of the robotic arm exceeds its limit) through sensors, and evaluating, in combination with the knowledge base, whether there are risks or deviations. Based on the above monitoring, the state assessment module acts as a Wisdom layer managing the operation of the whole system: it does not intervene under normal circumstances, and once an abnormal trend is detected it adaptively maintains the stable, high-precision operation of the system by adjusting parameters, downgrading functions, or issuing warnings (a minimal monitoring-loop sketch follows this list). This continuous self-checking and self-adjustment capability is one of the highlights distinguishing our system from previous rigid brain control systems.
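A minimal sketch of such a monitoring loop is given below. The three status readers, thresholds, and responses (pause, recalibrate, warn) are illustrative placeholders for the actual Wisdom-layer policies.

```python
from dataclasses import dataclass

@dataclass
class SystemStatus:
    user_attention: float      # 0..1, from an EEG-based attention estimate
    decoder_confidence: float  # 0..1, recent mean decoding confidence
    device_ok: bool            # peripheral sensors within limits

ATTENTION_MIN = 0.4            # assumed thresholds
CONFIDENCE_MIN = 0.5

def supervise(status: SystemStatus) -> str:
    """Wisdom-layer supervisor: normally do nothing, intervene only on abnormal trends."""
    if not status.device_ok:
        return "halt_device_and_warn"          # safety first: stop the peripheral
    if status.user_attention < ATTENTION_MIN:
        return "pause_and_prompt_rest"         # defer high-risk actions, prompt the user
    if status.decoder_confidence < CONFIDENCE_MIN:
        return "trigger_auto_calibration"      # run the built-in calibration sequence
    return "no_intervention"

print(supervise(SystemStatus(0.8, 0.3, True)))  # -> trigger_auto_calibration
```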
High-degree-of-freedom fine control algorithm: To achieve high-precision control, brain control instructions must be not only correct but also continuous, smooth, and finely graded. To this end, we will introduce intelligent smoothing and optimization algorithms in the mapping from decoded results to device instructions. For example, for joint angle control of the robotic arm, the target positions decoded directly from the brain signal may form a discrete, jerky sequence; we will use optimal control or deep reinforcement learning algorithms to optimize these sequences, under the premise of satisfying the user's purpose, so that joint trajectories are smooth and obstacles are avoided. For prosthetic gesture control, we may use shared control: combining auxiliary signals such as electromyography, the brain's coarse instructions are fine-tuned by local feedback so that finger force can be controlled accurately. We also plan to add fault-tolerant control on the device side; for example, when the brain signal is transiently disturbed and an abnormal instruction appears, the device controller can automatically damp the motion according to the most recent purpose and physical limits to prevent dangerous actions. These measures are designed to improve the accuracy and safety of overall control.
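As a simple illustration of the smoothing step between decoded targets and device commands, the sketch below applies exponential smoothing with a per-cycle step limit. The smoothing factor and limits are placeholder values; the project's actual optimizer (optimal control or reinforcement learning) would replace this.

```python
import numpy as np

ALPHA = 0.3            # assumed smoothing factor (0 = frozen, 1 = raw decoder output)
MAX_STEP_CM = 2.0      # assumed per-cycle motion limit for safety

class TrajectorySmoother:
    """Turn jerky decoded target positions into a smooth, rate-limited command stream."""

    def __init__(self, start: np.ndarray):
        self.state = start.astype(float)

    def step(self, decoded_target: np.ndarray) -> np.ndarray:
        desired = (1 - ALPHA) * self.state + ALPHA * decoded_target           # exponential smoothing
        delta = np.clip(desired - self.state, -MAX_STEP_CM, MAX_STEP_CM)      # rate limiting
        self.state = self.state + delta
        return self.state

smoother = TrajectorySmoother(np.zeros(3))
for target in [np.array([10.0, 0.0, 0.0])] * 3:      # abrupt decoded targets
    print(smoother.step(target))                     # converges gradually instead of jumping
```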
Parallel control of multiple peripherals and task collaboration: As a cutting-edge exploration, we will attempt one-person, multi-machine brain control, that is, the user simultaneously controls multiple devices through brain signals to complete a task. This is a major challenge for both brain decoding and interface mapping: the decoder needs to distinguish parallel purposes or switch between purposes quickly in time. Therefore, we will initially choose a relatively simple test scenario, such as controlling the movement of a wheelchair and the grasping of a robotic arm at the same time; for example, imagined left-hand movement may control wheelchair steering while imagined right-hand movement controls the robotic arm. On the interface side, we build a task management layer that allows the user to activate different control modes through specific brain signal patterns (similar to switching "control channels"), or the AI Purpose module automatically determines which device the user is currently focused on. Although this is an ambitious goal, exploring its feasibility will lay the foundation for more complex human-robot collaboration in the future.
Expected Results: This research will result in a prototype of a brain-controlled system that demonstrates high-precision brain-controlled operation of multiple devices in a laboratory environment. The assessment indicators include:
Control accuracy: For typical equipment tasks (such as robotic arm grabbing small objects, exoskeleton assisted walking), the success rate needs to reach more than 80%, and key parameters such as the positioning error at the end of the robotic arm < 2 cm, and the exoskeleton gait synchronization delay < 0.2 seconds.
Real-time performance: the total delay from brain signal acquisition to device response is < 300 ms (including < 150 ms for the decoding algorithm, < 50 ms for network transmission, and the remainder for device execution), and the processing/refresh rate of each module reaches 100 Hz or more, ensuring that users perceive no obvious delay.
Robustness: the system can run continuously for > 1 hour without interruption and without manual recalibration; it can maintain > 90% of its function even when a certain amount of noise is introduced or the user's attention fluctuates, with no dangerous misoperations. The number of automatic calibration triggers will be recorded and should be less than once every 30 minutes.
Versatility: successfully adapt to ≥ 5 kinds of peripheral devices, completing at least 2 different functional tests for each of them. For example: robotic arm (completing the two actions "pouring water" and "writing"), prosthetic hand (completing the two actions "pinching a ball" and "holding a paper cup"), etc. When switching between devices, only the output interface configuration needs to be changed while the core decoding algorithm remains unchanged, demonstrating the universality of the system.
User experience: a small number of healthy volunteers and target users (e.g., 1 amputee and 1 paraplegic participant) will be invited to take part in trials, and subjective feedback will be collected. The goal is that most participants report that the control experience feels natural, the learning time is < 30 minutes, and they are willing to use the system repeatedly. We will pay special attention to users' mental load during use and strive to minimize it through intelligent system assistance.
For example, compared with the 2019 demonstration in which a French research team enabled a quadriplegic patient to use invasive electrodes to control an exoskeleton to walk and touch objects, our system is expected to achieve similar functions with greater flexibility under non-invasive conditions; and compared with BrainGate, which used a brain-computer interface to control a robotic arm to pick up a cup and drink, our system will add more autonomous intelligence, reduce calibration frequency, and improve safety. Ultimately, we hope to establish China's leading position in the development of multi-degree-of-freedom brain control systems and lay the foundation for subsequent clinical translation and industrialization.
3.4 Research on brain-cognitive-model alignment mechanism
Research Objectives: To explore the alignment mechanism between the cognitive representations of the human brain and the representations of artificial intelligence models, construct a human-machine cognitive channel, and realize semantic mapping between the knowledge/semantic space within large models and the cognitive content carried by brain signals. Through this study, we hope to answer: can the representations of large pre-trained AI models be used to better interpret and decode brain activity? Conversely, can cranial neural signals provide unique cognitive constraints for AI models, so as to establish a two-way human-machine cognitive coupling system? This content will provide a theoretical basis and experimental verification for the information-theoretic limits of brain-computer interfaces and for brain-computer cooperative intelligence.
Research content and program:
Representation extraction from the cognitive layer of large models: First, we need to obtain the representation of the corresponding "cognitive layer" in large AI models. Taking large language models (LLMs) as an example, their high-level hidden state vectors often carry rich contextual and semantic information and can be regarded as the model's "ideas". For example, when a GPT model processes a sentence, each Transformer layer has a hidden state, and we will focus on extracting the hidden states of the last few layers as the model's "Wisdom-layer" representation. For visual-language multimodal models, we can take the representation of their cross-modal fusion layer, which is considered to contain a comprehensive understanding of concepts. This project will select appropriate AI models for motor and language tasks: for language, we will select the open-source LLM with the best performance in Chinese; for vision/movement, we can choose models with action semantic understanding capabilities (such as pre-trained video understanding models) or models trained by ourselves. By having these models process the same or similar task inputs as our experiments (e.g., the model reads the same text we play to the subject, or the model watches a video of the participant performing an action), we extract the representation vectors inside the model as the coordinates of the AI cognitive space.
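The sketch below shows one way to extract such cognitive-layer coordinates from a pre-trained language model with Hugging Face Transformers. The model name and the choice of mean-pooling the final hidden layer are illustrative assumptions; the project's selected Chinese LLM and layer-selection strategy would replace them.

```python
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-chinese"   # placeholder for the Chinese model actually selected

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)

@torch.no_grad()
def cognitive_representation(text: str) -> torch.Tensor:
    """Mean-pool the final hidden layer as the model's 'Wisdom-layer' coordinate for this input."""
    inputs = tokenizer(text, return_tensors="pt")
    hidden = model(**inputs).hidden_states[-1]      # (1, seq_len, d_model)
    return hidden.mean(dim=1).squeeze(0)            # (d_model,)

vec = cognitive_representation("请把杯子递给我")    # the same sentence played to the subject
print(vec.shape)   # torch.Size([768])
```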
Mapping of brain signal space to cognitive space: The next core task is to learn a function f that maps representations of brain signals to the cognitive representations of the AI model, and possibly the inverse mapping g. This can be seen as searching for an alignment transformation between the brain space and the model space. On one hand, we plan to use supervised representation mapping: using samples that pair brain data with the corresponding semantics (e.g., after a sentence is heard, brain signal X corresponds to sentence semantics S, and the AI model yields semantic representation V), we train a feedforward network f: X -> V to minimize the gap between f(X) and V (e.g., mean squared error). On the other hand, we will explore zero-shot mapping methods such as direct geometric alignment: prior work has found that the semantic embedding space in the brain and the embedding space of a language model share similar geometric structure and can be largely superimposed through a linear mapping. We will apply principal component analysis or network training to brain responses (e.g., neural embeddings of a series of words) and the language model's word vectors, to see whether a simple transformation can align the two point sets and thereby test this geometric similarity hypothesis. Beyond linear methods, we will also introduce the idea of generative adversarial networks (GANs): a mapper projects brain signals into the model's hidden space, a discriminator judges whether the projection resembles a genuine model hidden vector, and iterative training drives the two distributions toward agreement.
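A minimal sketch of the supervised linear baseline for f is shown below, assuming per-trial brain features and paired model-space embeddings have already been extracted; the array shapes and the ridge penalty are illustrative assumptions, not project specifications, and a small MLP or an adversarial mapper could be swapped in behind the same interface.

```python
# Minimal sketch: learn f: X -> V with ridge regression, minimizing ||f(X) - V||^2
# plus a regularization penalty, then report held-out mean squared error.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_trials, n_brain_feats, n_model_dims = 500, 256, 768   # illustrative sizes only
X = rng.standard_normal((n_trials, n_brain_feats))       # brain-signal features per trial
V = rng.standard_normal((n_trials, n_model_dims))        # paired model embeddings per trial

X_tr, X_te, V_tr, V_te = train_test_split(X, V, test_size=0.2, random_state=0)

f = Ridge(alpha=10.0)      # regularization strength would be tuned by cross-validation
f.fit(X_tr, V_tr)          # multi-output linear mapping from brain space to model space
V_pred = f.predict(X_te)

mse = np.mean((V_pred - V_te) ** 2)
print(f"held-out MSE: {mse:.4f}")
```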
Semantic alignment evaluation: To verify the alignment effect, we design several evaluation methods:
1) Retrieval matching accuracy: given a segment of brain signal (such as the signal recorded while a sentence is heard), the mapping f is applied to obtain a model-space vector, and the most similar text vectors are then retrieved from the model's corpus to check whether the top N candidates contain the correct sentence (a simple sketch of this evaluation follows this list). This is similar to the zero-shot mapping validation used in a previously published Nature Communications paper, and will test whether brain embeddings and model embeddings share a common pattern.
2) Decoding task performance: the mapped model representation is fed directly into downstream tasks of the pre-trained model, such as question answering or translation, to check whether the output matches the meaning the subject intended to express. For example, if a participant looks at a picture and describes it silently in their mind, we map the brain signal into the language model space and then let the language model generate a textual description, evaluating whether it is close to the human description.
3) Brain-computer collaborative task: design a task that humans and AI must complete together. For example, the AI generates the beginning of a sentence, the user silently continues the ending in their mind, we decode the ending, and the AI judges whether it fits the semantics; or the AI is asked to guess which of several images the user is imagining. The success rate of these interactive tasks evaluates the practical usefulness of the alignment channel.
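The sketch below illustrates evaluation 1) above, assuming V_pred holds brain-derived vectors already mapped into model space and corpus_emb holds the embeddings of all candidate sentences; variable names and sizes are illustrative only.

```python
# Minimal sketch: top-N retrieval matching accuracy via cosine similarity.
import numpy as np

def top_n_accuracy(V_pred: np.ndarray, corpus_emb: np.ndarray,
                   true_idx: np.ndarray, n: int = 5) -> float:
    """Fraction of trials whose correct candidate appears among the n nearest neighbors."""
    p = V_pred / np.linalg.norm(V_pred, axis=1, keepdims=True)
    c = corpus_emb / np.linalg.norm(corpus_emb, axis=1, keepdims=True)
    sims = p @ c.T                               # (n_trials, n_candidates) cosine similarities
    top_n = np.argsort(-sims, axis=1)[:, :n]     # indices of the n most similar candidates
    hits = [true_idx[i] in top_n[i] for i in range(len(true_idx))]
    return float(np.mean(hits))

# Illustrative call with random data; real inputs would come from the mapping above.
rng = np.random.default_rng(1)
acc = top_n_accuracy(rng.standard_normal((100, 768)),
                     rng.standard_normal((1000, 768)),
                     rng.integers(0, 1000, size=100), n=5)
print(f"top-5 retrieval accuracy: {acc:.2%}")
```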
Construction of a human-machine two-way cognitive channel: On the basis of the brain -> model mapping, we will also attempt to build a model -> brain reverse pathway, that is, a closed loop of brain-computer collaboration. If a mapping g: V -> X can be learned (e.g., through neurofeedback training, in which the AI model adjusts its output to drive the user toward a specific detectable brain-signal pattern), it means that we can partially "write" information to the brain. We will be very cautious here and limit ourselves to non-invasive, low-risk attempts, such as using visual/auditory feedback to guide the user's brain into a known pattern. If the AI can adjust its presentation based on monitored brain feedback (e.g., if the AI tries to synchronize a certain rhythm with the user's brain or induces a specific resonant brainwave state), that would validate the possibility of bidirectional alignment. Of course, the main focus of this project remains one-way decoding alignment, but we believe this reverse exploration will provide valuable clues for future brain-AI symbiotic technologies.
Privacy and ethical considerations: Brain-model alignment raises potential concerns about brain privacy, since it may probe more deeply into the semantic information contained in brain activity. This study will adhere strictly to ethical norms: all participants will sign informed consent, data will be anonymized, analysis will be limited to the specific task domains studied, and no private personal associations will be involved. We will also technically investigate encrypted mappings, so that the alignment model cannot be misused to recover thoughts users do not wish to share. These explorations will help establish privacy standards in the brain-computer interface field.
Expected results: Through this research, we hope to achieve the first domestic demonstration of deep alignment between the semantic space of large models and the cognitive space of the human brain. Specifically, the expected accomplishments include:
Algorithm module: a semantic alignment algorithm module (software) that takes EEG/MEG or similar signals as input, outputs the corresponding large-model hidden-layer representation vectors, and supports coarse reverse encoding. We will strive for this module to reach an internationally advanced level in tests on public datasets, for example achieving top results on public cross-modal brain-computer semantic alignment tasks. Notably, researchers have released the ChineseEEG dataset for Chinese semantic alignment and neural decoding research. We will verify the effectiveness of the proposed algorithm on this dataset and expect to improve accuracy by >10% over existing methods.
Theoretical papers: write and publish 1-2 high-level papers that systematically describe the methods and experimental results of brain-cognitive-model alignment, analyzing which semantic dimensions are easy or difficult to align and which layers of the large model are most similar to the human brain. These results are expected to have an important impact at the intersection of cognitive neuroscience and artificial intelligence.
Human-machine collaboration demonstration: present a simple example of human-machine semantic communication. For instance, a subject views several pictures and describes one of them purely in imagination; our system decodes the signal and generates a text description, and the AI assistant guesses which picture it is. This will showcase a prototype of a new interaction paradigm based on human-machine alignment.
Through this study, we will prospectively expand the scope of brain-computer interfaces, from simple signal control to the integration of brain and AI knowledge. In the long run, the human brain and the AI "brain" will achieve deep synergy through the alignment channel we establish, which will not only drive a leap in brain-computer interface performance but also provide valuable clues for understanding the unified mechanisms of human and machine intelligence.
4. Feasibility analysis
The project is led by Prof. Yucong Duan, whose team has a solid research foundation and rich resources in the fields of brain-computer interface, artificial intelligence and artificial consciousness, which provides a reliable guarantee for the smooth implementation of the project.
Research basis: Prof. Yucong Duan's team has long been committed to proposing and refining the DIKWP model and artificial consciousness theory, and is an internationally leading team in this field. As chairman of the World Conference on Artificial Consciousness and director of the DIKWP international standardization committee for artificial intelligence evaluation, Prof. Duan has led a number of research projects on artificial consciousness and has built a complete theoretical system. In terms of artificial consciousness hardware architecture, the team proposed the artificial consciousness processing unit (ACPU) architecture, which organically combines a subconscious computing unit, a conscious decision-making unit, and a fusion unit, realizing a prototype of artificial consciousness from theory through to hardware design. This innovative work demonstrates the team's ability to translate complex consciousness theory into engineering implementation, an important precursor to the design of this project's brain control system (especially its consciousness module). At the same time, the team has accumulated rich experience in concrete applications of the DIKWP model: for example, applying the DIKWP model in a medical dialogue system to resolve diagnostic disagreement, significantly improving the system's interpretability; and, to address the problem of large-model hallucination, exploring the combination of DIKWP with chain-of-thought and other reasoning paths. These studies provide inspiration and reference for applying DIKWP to brain-signal decoding.
In the direction of brain-computer interfaces and neural engineering, the team has also made a number of advances in recent years. A project led by Prof. Yucong Duan has proposed integrating the DIKWP artificial consciousness model with a high spatiotemporal-resolution ultrasound brain-computer interface to break through bottlenecks in the BCI field. The idea has been preliminarily demonstrated, and a call for cooperation has been published on a scientific research network, indicating that the team is at the forefront of exploring new brain-signal acquisition and decoding technologies. In addition, team members include experts with multidisciplinary backgrounds in computational neuroscience, pattern recognition, robotics engineering, and more. Among the core research backbone are PhDs with rich experience in EEG/brain-imaging data analysis (who have participated in a national key R&D program brain-science project), young researchers skilled in machine learning and large-model applications (who have published papers on the association between large models and brain activity at top conferences), and engineers proficient in embedded systems and robot control (who have participated in the development of medical robot products). This multidisciplinary configuration enables us to attack key problems collaboratively from the perspectives of brain science, artificial intelligence, and electronic engineering. The team has built a basic laboratory platform equipped with a 128-channel EEG acquisition system, a functional near-infrared imager, virtual-reality equipment, and a variety of peripheral prototypes such as robotic arms, myoelectric prosthetic hands, and lower-limb exoskeletons, which initially meets the experimental needs of this project. Relying on this platform, the team has successfully demonstrated using EEG signals to control a simple robotic arm to grasp a sphere, and achieved good results in the National College Student Brain Control Competition. Although these preliminary explorations are relatively simple, they provide valuable experience for more complex brain-control research (such as multi-channel signal noise reduction and triggering mechanisms for asynchronous brain-control commands).
Innovation ability: The team's capacity for innovation in both theory and practice is outstanding, and its published papers have repeatedly been selected as hot papers. In particular, new ideas proposed by Professor Yucong Duan, such as the "consciousness bug theory" and the "relational definition of semantics", have attracted attention in the academic community. In AI evaluation, the team developed the world's first consciousness-level assessment method for large models and released a white-box DIKWP assessment report, filling a gap in this field. These achievements demonstrate the team's ability to grasp the scientific frontier accurately and lead it. For this project, we have the drive for 0-to-1 innovation and are confident in integrating the new DIKWP × DIKWP ideas into the brain-computer interface to realize the world's first artificial-consciousness-guided brain control system. Initial simulations suggest this is feasible: a recent internal report by our team shows that combining the DeepSeek large model with the DIKWP × DIKWP framework can significantly improve the efficiency of semantic interaction with multi-source heterogeneous data, which bodes well for using large models for semantic decoding of brain signals. More specifically, the team has an existing cooperation with a hospital rehabilitation center to jointly carry out electrical-stimulation rehabilitation research on spinal cord-peripheral nerve interfaces, and has proposed a multi-parameter neuromodulation strategy integrating the DIKWP model. That project involves electrical-stimulation treatment of paraplegic patients, and the problems it must solve, such as cross-scale signal processing and closed-loop control, have similarities with the brain control system of this project. We can therefore share research resources and results, such as neural-signal processing algorithm modules and rehabilitation assessment methods, to accelerate the progress of this project.
Resource conditions: The host institution of this project is supported by a national key laboratory platform and can provide the necessary equipment purchases and test sites. For example, the laboratory has a shielded room for EEG experiments and a robotic motion-capture space for exoskeleton testing. In addition, we maintain cooperation and communication with top teams in many related fields, such as the Brain-like Computing Center of Tsinghua University (which provides guidance on multimodal brain-signal processing), the neurosurgery department of a tertiary hospital in Beijing (which supports the acquisition of cortical-electrode data from some patients for research), and an AI company (which provides the latest large-model APIs and computing-power support). These industry-university-research partnerships will support the implementation of the project. In terms of funding and management, the host institution has rich experience and can ensure the project proceeds smoothly. A number of national-level projects previously undertaken by the team have been completed on schedule and with high quality, forming an efficient organization and coordination mechanism. Project team members have a clear division of labor and work closely together, and will keep the research on schedule through weekly meetings and stage-by-stage acceptance.
Risk response: Although this project is challenging, the team has contingency plans. If brain-signal decoding does not perform as expected, we have prepared a variety of alternative algorithms and fusion schemes (such as adding electromyography signals as an aid to improve motor-purpose recognition rates). For safety risks in peripheral control, we will assign safety officers and emergency-stop mechanisms during experiments, and strictly limit scenarios while increasing complexity gradually to ensure no accidents occur. In addition, we will introduce rolling evaluations during project execution to adjust research priorities based on interim results and minimize risk.
In summary, human-machine-consciousness integration is an emerging frontier direction. Professor Yucong Duan's team has accumulated long-term experience in this direction, bringing together advantages in theory, technology, and data, and is well positioned to implement this project. A strong interdisciplinary lineup and early results give us confidence that we can achieve the set goals and make breakthroughs within the project cycle.
5. Phased results and assessment indicators
The project will be carried out in three phases; each phase produces corresponding results and must meet its assessment requirements.
- Phase 1 (Project Year 1): Cranial Neural Coding Models and Datasets. Achievements include:
(1) High-quality multimodal brain-signal dataset: covering language and motor-imagery experimental data from at least 20 subjects, with ≥ 2 signal modalities collected per subject (such as 64-channel EEG + functional ultrasound) and a cumulative data duration > 50 hours; the dataset will be opened after preprocessing and labeling are completed (or a data description paper will be published).
(2) Fine neural coding model: The motor/language neural coding model based on DIKWP theory has a >15% higher decoding accuracy than the traditional method after cross-validation.
(3) Research report/dissertation: submit at least one paper elaborating new findings on the cranial neural coding mechanism (such as a discovered correspondence between a specific frequency band in a brain region and a certain cognitive semantic category). Assessment indicators: dataset completeness and quality; performance improvement of the coding model; publication or acceptance of papers.
- Phase 2 (Project Year 2): Multimodal AI Decoding Algorithm and Consciousness Fusion Integration. Achievements include:
(1) Brain-signal decoding algorithm library: implement an algorithm library including deep-learning decoding, LLM fusion, purpose inference, and other modules, supporting both offline analysis and real-time operation. The algorithms achieve the expected performance on experimental data, such as a BLEU score > 0.4 for brain-to-text decoding in a limited context and a classification accuracy > 80% for brain-to-motor decoding (specific indicators to be refined according to Phase 1 results).
(2) Decoding demonstration with artificial consciousness fusion: build a simulation platform to demonstrate, on a computer, the decoding process guided by artificial consciousness; for example, when the decoding module is uncertain, the Purpose module successfully guides it to invoke a backup strategy, improving overall accuracy.
(3) Phase papers: write 1 paper on multimodal decoding methods and strive for publication at an international conference. Assessment indicators: whether decoding-algorithm performance meets the milestone requirements (specific values such as accuracy and latency); whether functional verification of the artificial consciousness module succeeds; whether paper or patent outputs have been achieved.
- Phase 3 (Project Year 3): High-precision brain control system integration and application verification. Achievements include:
(1) Brain control system prototype: realize a complete closed-loop system, including signal-acquisition hardware, decoding software, a control interface, and joint debugging with at least 3 kinds of devices (robotic arm, prosthetic hand, wheelchair, etc.). The system will be experimentally verified by having healthy volunteers use the brain-controlled system to complete specific tasks (such as mind-controlled robotic-arm pickup and brain-controlled wheelchair navigation along simple routes).
(2) Evaluation report: record in detail the system's performance data on typical tasks, such as task completion time, success rate, and users' subjective load ratings, and compare it with similar international achievements to demonstrate our advantages in accuracy or functionality. Particular emphasis is placed on realizing real-time decoding for high-channel-count brain signals and on the effectiveness of the DIKWP-guided intelligent brain control system, for example requiring that the decoding module, prompted by the DIKWP evaluation module in the closed loop, needs automatic recalibration less than once per day, a significant reduction compared with conventional systems.
(3) Application module and demonstration: develop a semantic alignment algorithm module embedded in the system, so that it has preliminary brain semantic communication functions (such as simple text brain typing); Develop an artificial consciousness-driven EEG interaction module to improve the efficiency of user purpose expression in specific scenarios (for example, the system automatically increases sensitivity when the user is focused). These modules form functional components that can be handed over to a healthcare facility or enterprise.
(4) Summary papers/patents: publish 1-2 systematic papers summarizing system-wide R&D experience and innovations; submit more than 2 invention patent applications (covering brain-computer interface decoding algorithms, closed-loop control, etc.). Assessment indicators: whether the functions of the brain control system are fully realized and the indicators meet the task-book requirements (such as success rate, latency, and number of adapted devices); whether the demonstration application achieves the desired effect (e.g., a paralyzed person can operate a prosthesis to complete daily movements in a clinical trial); whether intellectual-property output meets the target.
In addition, during the implementation of the project, milestone review and other methods will be used to ensure that the stage goals are achieved. For example, the mid-term inspection requires the demonstration of at least one brain-controlled device operation, and the acceptance requires multi-device scenario demonstration and third-party test verification. The expected phased results of this project are not only a response to the assessment indicators, but also will lay the foundation for a new generation of brain-computer interface technology system in China: including high-channel number decoding hardware, a software platform that integrates artificial consciousness, and a complete set of application solutions. These results will be sorted out and reported at the end of the project, and released to the public in an appropriate way (such as holding a press conference or participating in the achievement exhibition organized by the Ministry of Science and Technology).
6. Application and Transformation Paths
The research results of this project have broad application prospects and clear transformation paths, which can serve many fields such as rehabilitation medicine, human-computer collaboration, neuroprosthesis, and intelligent interaction, and provide core technical support for the development of related industries.
(1) Rehabilitation medicine: The high-precision brain control system will first play a direct role in rehabilitation medicine. For patients with high paraplegia, ALS, and other conditions that take away the ability to move or speak, our technology can provide a new channel for communication and restored movement. For example, a brain-controlled robotic arm can assist paraplegic patients with daily activities such as independent feeding and personal care, greatly improving their ability to care for themselves. A brain-controlled exoskeleton is expected to help some paraplegic patients stand and walk again and carry out basic gait training (preliminary practice at the University of Grenoble in France has demonstrated the feasibility of a quadriplegic patient walking with the help of a brain-controlled exoskeleton). Our system aims to achieve similar functionality in a non-invasive way, which will lower the barrier to clinical application and benefit more patients. At the same time, brain-language decoding technology can be used to build thought-typing or thought-to-speech devices, allowing patients with aphasia or locked-in syndrome to "speak" their thoughts using only the electrical activity of their brains. Meta and the University of Texas have achieved early results in this direction, and our further work on Chinese semantic decoding is expected to be applied to communication aids for domestic patients. At the end of the project, we plan to carry out a small-sample trial evaluation with partner hospitals, providing prototypes such as brain-controlled prosthetic hands and brain-controlled input methods to target patients for 1-2 weeks of trial use and collecting clinical feedback. If the results are good, we will advance the medical-device registration process and move toward commercialization. For rehabilitation centers, our high-channel-count brain-signal acquisition equipment and decoding software can upgrade existing brain-computer training systems, improving the accuracy and engagement of training and forming a commercial rehabilitation training system (for example, limb-function reconstruction training for stroke patients, allowing them to practice repeatedly through brain-controlled games). We expect that, building on the project results, brain-controlled rehabilitation assistive products can be incubated within 2-3 years and promoted to the market in cooperation with professional medical-device companies to meet the huge rehabilitation demand in China.
(2) Human-machine collaborative operation: The technology of this project can also be applied to human-machine collaborative operation in industry, services, and other fields to improve efficiency and safety. For example, in hazardous environments (mines, firefighting, etc.), workers wearing brain-computer interface devices can remotely control robots or drones to complete detection and operation tasks without entering danger zones, ensuring personal safety and improving task success rates. In high-precision manufacturing, experienced technicians could complete ultra-precision assembly or operations through brain-controlled manipulators or collaborative robots: direct brain control can break through the physical limitations of traditional hand-eye coordination and achieve faster and more stable operation (direct brain-computer communication can be faster than hand movements, and AI assistance can filter out hand tremor). Even in the aerospace field, brain-computer interfaces could allow ground experts to remotely control robotic assistants on a space station through their minds, or allow astronauts to control multiple robotic arms simultaneously during extravehicular activity. Although these applications are still some distance from practical implementation, our research will provide the core module: highly reliable brain-purpose decoding plus multi-machine control. The human-machine cognitive alignment results of this project will also be used to improve the efficiency of human-machine collaboration: AI can better predict human purpose and cooperate proactively. For example, future intelligent construction machinery could automatically adjust its working pace or take over part of the control when it reads stress or fatigue signals in the operator's EEG, thereby avoiding accidents. This purpose-aligned human-machine co-driving technology is very promising in intelligent driving and flight; some automakers are already researching brainwave monitoring of driver state, and our more proactive human-computer interaction results will become a new technological growth point.
(3) Neuroprostheses and assistive devices: For amputees and people with disabilities, our brain-controlled high-degree-of-freedom prosthetic hands and legs can greatly improve quality of life. Traditional prostheses are mostly controlled by electromyography, which has a high learning cost and supports only limited movements. Brain-controlled prostheses can be driven more naturally by the user's thoughts, and our AI-assisted prostheses can understand the user's purpose, such as grasping an egg with appropriate force without crushing it (with the assistance of AI and visual sensing). The prosthetic control technology developed in the project can be integrated into the next generation of intelligent prosthetic products in cooperation with domestic prosthetic manufacturers. In addition, our brain-computer interface caps, consciousness communication devices, and similar products can enter the elderly care market as smart assistive devices. For example, in a brain-controlled smart home, elderly people with impaired mobility or aphasia can control household lights and electrical switches by wearing a brain-computer headband, or call and communicate with caregivers through their thoughts. In the later stage of the project, our team will work with nursing homes and rehabilitation centers to trial brain-controlled smart beds, brain-controlled wheelchairs, and other equipment, collect feedback on requirements, and prepare for product finalization. By co-building demonstration projects with industry partners, such as a smart-home pilot, we will accelerate technology maturity and public acceptance.
(4) New human-computer intelligent interaction: The human brain-AI alignment channel of this project opens the possibility of direct brain interaction in the future. For example, combined with AR/VR technology, users wearing BCI devices can use their minds to manipulate avatars and communicate with virtual characters in environments such as the metaverse, achieving a true "exchange of ideas". The semantic decoding module and alignment algorithm among our results can be used to develop applications such as brain-controlled input methods and mind chat to meet novel public demand in entertainment and communication. Imagine a future video game in which players control character movement and even release skills using only brain waves, without a controller: a new immersive experience. There are already preliminary explorations of brain-computer games, but most are limited to simple brainwave-intensity control. Our high-precision decoding will make complex in-game operations possible and open a new blue ocean for the game industry. In terms of text input, brain typing will greatly help people with disabilities and may also develop into a new input method for able-bodied users (especially once brain-computer devices become light enough). We will make the decoding algorithm available as an API to developers in the human-computer interaction field, giving rise to more application scenarios.
Transformation path: To promote the implementation of the above applications, we have developed a clear transformation plan. First, during project implementation we will attend to patent layout, applying in a timely manner for patents on technologies with industrial value, such as "brain-computer interface decoding method based on artificial consciousness", "brain signal-semantic alignment method", and "multi-device brain control safety mechanism", laying an intellectual-property foundation for future business cooperation. At the end of the project, we plan to consolidate a complete software and hardware system solution, including EEG acquisition hardware, AI decoding and consciousness module software, and application-layer interfaces, forming a prototype ready for commercialization. Next, we will seek cooperation with medical rehabilitation equipment manufacturers and high-tech companies to promote industrialization through industry-university-research cooperation. For example, a rehabilitation device company we have initially approached is very interested in our brain-controlled prosthetics technology and hopes to jointly develop smart prostheses and apply for medical-device certification; a start-up with brain-computer hardware mass-production capability, such as a company like BrainCo, has expressed interest in partnering to embed our decoding algorithms into its headband products to improve competitiveness. Our team is also considering independently incubating a start-up focused on brain-computer interface software platforms, exporting our brain-decoding capability to various industries through software licensing or algorithm cloud services.
Standards and regulation: As an emerging industry, brain-computer interfaces also need standards and specifications. As a member of the relevant international standards committee, Professor Yucong Duan will take the lead, on the basis of this project, in formulating group standards or guidelines for artificial-consciousness brain-computer interfaces, ensuring technical safety, controllability, and ethical compliance and safeguarding the healthy development of the industry. At the same time, we will pay close attention to regulatory developments, actively communicate with the national drug regulatory authority about review requirements for brain-computer interface medical products, and build security measures (such as data encryption and permission control) into the technical implementation to strengthen the case for future product approval.
In short, the transformation of this project's results will be steadily advanced along the path of "scientific research demonstration - clinical pilot - product incubation - standard guidance". Over a medium-term horizon of about 5 years, we will strive to put a number of brain-controlled rehabilitation assistive devices into clinical use and create a leading Chinese brain-computer interface brand or product line. In the longer term, the deep human brain-artificial intelligence integration technology explored in this project is expected to give rise to a new concept of "intelligence augmenting people" and provide core technical reserves for the country to occupy the strategic high ground in the integration of brain science and brain-like intelligence. We believe that the implementation of this project will not only produce a series of academic innovations and engineering breakthroughs, but also directly push brain-computer interfaces from the laboratory to the clinic and industry, making positive contributions to the building of a healthy China and an intelligent society.