Academic Homepage · Visual Representation

Research in Visual Understanding and Multimodal Learning

I focus on computer vision, deep learning, and large multimodal models. I keep a long-running record of papers, experiments, and engineering practice, and turn research questions into reproducible systems.

View Publications · Visit Blog

Focus: Computer Vision

Methods: Deep Learning · Multimodal AI

Output: 2 Publications · Engineering Notes

01 About Me

02 Selected Projects

Research Workflow

Academic Submission Assistant

A publicly available workflow desk for graduate students and research authors that brings venue discovery, frontier tracking, and AI paper precheck into one place.

Open Project

Mobile Agent

MeetingMind

An Android meeting agent for mobile office work that connects recording transcription, minutes, action items, and slide-deck generation.

Open Project

Creator Tool

Redflow XHS

A Markdown-first Xiaohongshu card rendering and publishing workflow that captures the content creation process as reusable skills.

View GitHub

Vision Tool

Construction Detection

A safety inspection assistant for construction-site imagery, focused on detecting risky behavior and hazard regions.

View Project

Agent Interface

Codex Remote PWA

A mobile-first console for local Codex sessions that brings long-running task monitoring, session switching, and lightweight actions to the phone.

Open Project

Tech Stack

03 Latest News

04 Publications

05 Honors & Awards