Academic Homepage · Visual Representation
Research in Visual Understanding and Multimodal Learning
I focus on computer vision, deep learning, and large multimodal models. I keep a long-running record of papers, experiments, and engineering practice, and turn research questions into reproducible systems.
01 About Me
02 Selected Projects
Academic Submission Assistant
A public research workbench for graduate students and research authors that brings together venue selection, frontier tracking, and AI paper pre-review.
Open Project
Mobile Agent
MeetingMind
An Android meeting agent for mobile work that chains together recording transcription, meeting minutes, action items, and slide-deck generation.
Open Project
Creator Tool
Redflow XHS
A Markdown-first Xiaohongshu card renderer and publishing workflow that captures the creation process as reusable skills.
View GitHub
Vision Tool
Construction Detection
A safety inspection assistant for construction-site imagery, focused on identifying rule violations and hazardous areas.
View Project
Agent Interface
Codex Remote PWA
A mobile-first console for local Codex sessions that puts long-running task monitoring, session switching, and lightweight actions on the phone.
Open Project