<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://chunzhuo.github.io/feed.xml" rel="self" type="application/atom+xml"/><link href="https://chunzhuo.github.io/" rel="alternate" type="text/html" hreflang="en"/><updated>2026-05-12T15:35:17+00:00</updated><id>https://chunzhuo.github.io/feed.xml</id><title type="html">Chunzhuo Zhang</title><subtitle>Researcher working at the intersection of AI4Bio, bioinformatics, and machine learning. </subtitle><entry xml:lang="en"><title type="html">Single-cell Perturb-seq CRISPRi</title><link href="https://chunzhuo.github.io/blog/2026/perturb-seq-crispri/" rel="alternate" type="text/html" title="Single-cell Perturb-seq CRISPRi"/><published>2026-05-11T00:00:00+00:00</published><updated>2026-05-11T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2026/perturb-seq-crispri</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2026/perturb-seq-crispri/"><![CDATA[<p>CRISPRi is a useful perturbation because it behaves like a dimmer switch: the guide RNA brings a catalytically inactive Cas9 repressor to a regulatory region, and transcription drops without making a DNA double-strand break. Perturb-seq adds a pooled single-cell readout, so each cell carries both a perturbation identity and a transcriptome.</p> <p>The interactive view below is one continuous 3D cell, not a sequence of separate plots. Drag the cell to rotate it, scroll to zoom, or use the focus buttons to move from the whole cell into the chromatin zone, the open sgRNA target sequence, and the transcript readout.</p> <div class="perturbseq-crispri-viewer" data-zoom="whole"> <div class="perturbseq-toolbar" aria-label="Perturb-seq CRISPRi controls"> <div class="perturbseq-focus-buttons"> <button type="button" data-focus-target="cell">Whole cell</button> <button type="button" data-focus-target="nucleus">Chromatin</button> <button type="button" data-focus-target="binding">Open target</button> <button type="button" data-focus-target="readout">Readout</button> </div> <div class="perturbseq-zoom-controls"> <button type="button" data-zoom-action="out">-</button> <button type="button" data-zoom-action="in">+</button> <button type="button" data-zoom-action="reset">Reset view</button> <span class="perturbseq-zoom-status">100%</span> </div> </div> <div class="perturbseq-canvas-wrap"> <div class="perturbseq-three-stage" role="img" aria-label="Rotatable 3D cell view of Perturb-seq CRISPRi with organelles and sgRNA target binding"> <canvas class="perturbseq-three-canvas"></canvas> <div class="perturbseq-three-hint">Drag to rotate · Scroll to zoom</div> </div> <svg class="perturbseq-svg" viewBox="0 0 1000 700" role="img" aria-label="Zoomable whole-cell view of Perturb-seq CRISPRi with organelles and sgRNA target binding"> <defs> <marker id="perturbseq-arrow" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse"> <path d="M 0 0 L 10 5 L 0 10 z" fill="currentColor"></path> </marker> </defs> <g class="perturbseq-viewport"> <text class="perturbseq-title" x="34" y="44">Single-cell Perturb-seq CRISPRi</text> <text class="perturbseq-small" x="34" y="66">A single perturbed cell with zoomable organelles, CRISPRi target binding, transcripts, and guide identity.</text> <path class="perturbseq-cell-body" d="M135 348 C132 212 232 120 378 96 C560 65 775 126 857 266 C935 399 867 556 707 618 C554 678 329 636 219 531 C162 476 136 420 
135 348 Z"></path> <path class="perturbseq-cytoplasm-texture" d="M222 236 C312 198 380 194 468 218 M678 182 C756 226 796 290 796 360 M232 520 C325 564 464 580 592 558 M716 508 C774 472 814 418 816 348"></path> <g aria-label="nucleus"> <ellipse class="perturbseq-nucleus perturbseq-pulse" cx="455" cy="315" rx="205" ry="150"></ellipse> <ellipse class="perturbseq-nucleolus" cx="374" cy="360" rx="48" ry="35"></ellipse> <path class="perturbseq-dna" d="M306 292 C356 238 430 334 482 283 S596 256 625 323"></path> <path class="perturbseq-dna" d="M302 335 C366 386 421 283 488 346 S590 394 636 338" opacity="0.72"></path> <rect class="perturbseq-target-window perturbseq-pulse" x="430" y="244" width="142" height="72" rx="10"></rect> <text class="perturbseq-label" x="356" y="187">nucleus</text> <text class="perturbseq-small perturbseq-detail-medium" x="315" y="405">nucleolus</text> </g> <g aria-label="CRISPRi binding site"> <path class="perturbseq-dna perturbseq-detail-high" d="M442 278 C468 259 493 302 520 279 S552 268 567 287"></path> <circle class="perturbseq-cas9 perturbseq-pulse" cx="492" cy="282" r="18"></circle> <rect class="perturbseq-krab perturbseq-pulse" x="504" y="252" width="42" height="24" rx="8"></rect> <path class="perturbseq-sgrna perturbseq-pulse" d="M461 294 C474 319 506 320 519 297"></path> <path class="perturbseq-sgrna perturbseq-detail-high" d="M467 309 q8 12 16 0 q8 -12 16 0 q8 12 16 0"></path> <rect class="perturbseq-rnap" x="558" y="305" width="62" height="30" rx="15"></rect> <path class="perturbseq-transcript perturbseq-target-transcript" d="M620 321 C648 322 674 332 698 352"></path> <path class="perturbseq-transcript perturbseq-detail-high" d="M621 321 q10 -10 20 0 q10 10 20 0 q10 -10 20 0"></path> <text class="perturbseq-label perturbseq-detail-medium" x="518" y="238">dCas9-KRAB</text> <text class="perturbseq-small perturbseq-detail-high" x="425" y="333">sgRNA pairs with target sequence</text> <text class="perturbseq-small perturbseq-detail-high" x="586" y="354">reduced nascent transcript</text> </g> <g aria-label="endoplasmic reticulum"> <path class="perturbseq-er" d="M614 262 C705 246 766 278 775 346 C784 415 724 450 646 430"></path> <path class="perturbseq-er" d="M622 298 C695 291 734 316 736 358 C738 400 699 413 648 398" opacity="0.7"></path> <text class="perturbseq-small perturbseq-detail-medium" x="720" y="248">ER</text> </g> <g aria-label="mitochondria"> <ellipse class="perturbseq-mito" cx="255" cy="270" rx="62" ry="28" transform="rotate(-22 255 270)"></ellipse> <path class="perturbseq-detail-medium" d="M214 278 C238 248 258 294 296 260" fill="none" stroke="var(--ps-red)" stroke-linecap="round" stroke-width="2"></path> <ellipse class="perturbseq-mito" cx="730" cy="486" rx="68" ry="30" transform="rotate(18 730 486)"></ellipse> <path class="perturbseq-detail-medium" d="M684 478 C713 513 737 456 779 496" fill="none" stroke="var(--ps-red)" stroke-linecap="round" stroke-width="2"></path> <text class="perturbseq-small perturbseq-detail-medium" x="198" y="224">mitochondrion</text> </g> <g aria-label="golgi and vesicles"> <path class="perturbseq-organelle" d="M286 472 C340 438 392 448 431 488 C380 480 336 488 286 472 Z"></path> <path class="perturbseq-organelle" d="M296 502 C344 482 389 489 422 518 C372 516 332 519 296 502 Z" opacity="0.75"></path> <circle class="perturbseq-organelle" cx="452" cy="517" r="13"></circle> <circle class="perturbseq-organelle" cx="478" cy="496" r="9"></circle> <text class="perturbseq-small perturbseq-detail-medium" x="320" y="548">Golgi / 
vesicles</text> </g> <g aria-label="transcripts and guide molecules"> <path class="perturbseq-transcript" d="M610 464 C655 445 680 471 715 455"></path> <path class="perturbseq-transcript" d="M535 518 C578 500 622 535 660 509"></path> <path class="perturbseq-transcript perturbseq-target-transcript" d="M608 390 C636 405 660 396 682 418"></path> <circle class="perturbseq-guide-dot" cx="592" cy="494" r="9"></circle> <circle class="perturbseq-guide-dot" cx="629" cy="536" r="7" style="animation-delay: -1.2s;"></circle> <text class="perturbseq-small perturbseq-detail-medium" x="612" y="576">guide identity + transcriptome stay linked to this cell</text> </g> <g aria-label="single-cell capture barcode"> <rect class="perturbseq-callout" x="715" y="84" width="220" height="88" rx="8"></rect> <text class="perturbseq-label" x="734" y="114">Perturb-seq readout</text> <text class="perturbseq-small" x="734" y="138">cell barcode + UMI</text> <text class="perturbseq-small" x="734" y="156">mRNA reads + guide tag</text> <path d="M720 174 C690 235 684 326 680 416" fill="none" stroke="var(--ps-muted)" stroke-dasharray="7 7" stroke-linecap="round" stroke-width="2" marker-end="url(#perturbseq-arrow)"></path> </g> <g class="perturbseq-detail-high" aria-label="zoom labels"> <rect class="perturbseq-callout" x="334" y="116" width="230" height="56" rx="8"></rect> <text class="perturbseq-small" x="350" y="140">Zoom depth reveals the molecular site:</text> <text class="perturbseq-small" x="350" y="158">sgRNA-dCas9-KRAB at target DNA/TSS</text> </g> </g> </svg> </div> <div class="perturbseq-info"> <div class="perturbseq-focus-label">Whole cell</div> <p class="perturbseq-focus-text">One perturbed cell remains in view: membrane, nucleus, organelles, sgRNA cargo, mRNA molecules, and capture barcode are all part of the same scene.</p> </div> </div> <script src="/assets/js/perturbseq-crispri.js?v=ea3af894f2a1b75ef3dad6214393cd07"></script> <h2 id="what-the-experiment-measures">What the experiment measures</h2> <p>The key output is not only whether a target gene went down. The useful object is a table where every row is a single cell, every cell has a guide assignment, and every column is a measured gene. That lets us ask whether perturbing one regulator shifts cells toward another state, suppresses a pathway, changes response to stimulation, or creates a subtle expression program that would be invisible in a bulk assay.</p> <h2 id="why-crispri-fits-this-readout">Why CRISPRi fits this readout</h2> <p>CRISPRi is especially useful when complete knockout is too harsh or when multiple perturbations would create too many DNA breaks. 
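Because it represses transcription through dCas9-KRAB rather than cutting DNA, it can be paired with pooled single-cell screens where the phenotype is a transcriptome, not just growth.</p> <p>To make the cells-by-genes table described above concrete, here is a minimal, hypothetical sketch in plain numpy and pandas. The gene and guide names are invented, and a real analysis would add normalization, guide-assignment quality control, and proper statistics:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
genes = ["GENE_A", "GENE_B", "GENE_C"]

# One row per cell, one column per measured gene.
X = pd.DataFrame(rng.poisson(5, size=(300, len(genes))), columns=genes)

# One guide assignment per cell: the perturbation identity.
guide = pd.Series(rng.choice(["sgGENE_A", "non-targeting"], size=300), name="guide")

# Compare a perturbation against non-targeting controls (log2 fold change).
ctrl_mean = X[guide == "non-targeting"].mean()
lfc = np.log2((X[guide == "sgGENE_A"].mean() + 1.0) / (ctrl_mean + 1.0))
print(lfc.round(2).to_dict())
</code></pre></div></div>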
<h2 id="minimal-protocol-logic">Minimal protocol logic</h2> <ol> <li>Build or obtain a cell line expressing CRISPRi machinery.</li> <li>Introduce a pooled sgRNA library at controlled multiplicity.</li> <li>Select and culture cells long enough for repression.</li> <li>Capture single cells and prepare transcriptome plus guide libraries.</li> <li>Sequence, assign guides to cells, and quantify expression.</li> <li>Compare each perturbation against controls and visualize response programs.</li> </ol>]]></content><author><name></name></author><category term="research-notes"/><category term="biology"/><category term="single-cell"/><category term="CRISPRi"/><category term="Perturb-seq"/><category term="functional-genomics"/><summary type="html"><![CDATA[An interactive visual explanation of how CRISPRi perturbations are linked to single-cell transcriptomes.]]></summary></entry><entry xml:lang="en"><title type="html">AI Daily Sprouts | 2026-05-10</title><link href="https://chunzhuo.github.io/blog/2026/ai-daily-sprouts-2026-05-10/" rel="alternate" type="text/html" title="AI Daily Sprouts | 2026-05-10"/><published>2026-05-10T00:00:00+00:00</published><updated>2026-05-10T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2026/ai-daily-sprouts-2026-05-10</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2026/ai-daily-sprouts-2026-05-10/"><![CDATA[<p>Search date: 2026-05-10. Window used: roughly the last 7 days.</p> <h2 id="top-items">Top items</h2> <h3 id="google-released-gemma-4-multi-token-prediction-drafters">Google released Gemma 4 multi-token prediction drafters</h3> <ul> <li>Date: 2026-05-05</li> <li>Source: <a href="https://blog.google/innovation-and-ai/technology/developers-tools/multi-token-prediction-gemma-4/">Google</a></li> <li>Type: open-model inference release</li> </ul> <p>Google released Multi-Token Prediction drafters for Gemma 4. The drafters use speculative decoding: a smaller draft component predicts several future tokens, then the main model verifies them in parallel. Google reports up to a 3x speedup without degrading output quality or reasoning logic.</p> <p>Why it matters: this targets a practical bottleneck for local, edge, and workstation LLM deployment. The bottleneck is often token-by-token latency rather than only raw model capability.</p> <p>Caveat: the speed and quality claims are vendor-reported and hardware-dependent; independent deployment measurements will matter.</p> <h3 id="microsoft-framed-frontier-firms-around-human-agent-operating-models">Microsoft framed “Frontier Firms” around human-agent operating models</h3> <ul> <li>Date: 2026-05-05</li> <li>Source: <a href="https://blogs.microsoft.com/blog/2026/05/05/how-frontier-firms-are-rebuilding-the-operating-model-for-the-age-of-ai/">Microsoft</a></li> <li>Type: enterprise AI / agent workflow update</li> </ul> <p>Microsoft described a progression from authoring with AI to editing, directing, and orchestrating AI agents, and tied that model to expanded Copilot Cowork capabilities.
The operating-model framing is useful because it moves the conversation from “does an assistant help?” to “how do governed agents run work across systems?”</p> <p>Caveat: this is an official product and strategy narrative, not an independent productivity study.</p> <h2 id="recent-papers">Recent papers</h2> <h3 id="llms-improving-llms-agentic-discovery-for-test-time-scaling">LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling</h3> <ul> <li>Date: 2026-05-08</li> <li>Source: <a href="https://arxiv.org/abs/2605.08083">arXiv:2605.08083</a></li> <li>Type: preprint</li> </ul> <p>AutoTTS reframes test-time scaling as a controller-synthesis problem. Instead of manually choosing when a model should branch, continue, probe, prune, or stop, the method searches over inference policies using pre-collected reasoning trajectories and probe signals.</p> <p>The authors report better accuracy-cost tradeoffs on math reasoning benchmarks, generalization to held-out benchmarks and model scales, and a discovery cost of about $39.90 and 160 minutes. The practical caveat is that this is still a new preprint; the promised code release should be checked before treating it as deployable infrastructure.</p> <h3 id="fast-byte-latent-transformer">Fast Byte Latent Transformer</h3> <ul> <li>Date: 2026-05-08</li> <li>Source: <a href="https://arxiv.org/abs/2605.08044">arXiv:2605.08044</a></li> <li>Type: preprint</li> </ul> <p>Byte-level language models avoid fixed subword vocabularies, but byte-by-byte decoding is slow. This paper introduces BLT Diffusion, BLT Self-speculation, and BLT Diffusion+Verification so byte-level models can generate multiple bytes per step or verify drafted bytes efficiently.</p> <p>The authors report that the approaches can reduce estimated memory-bandwidth cost by more than 50% on generation tasks. The next test is whether these methods hold up in real serving stacks and downstream applications.</p> <h3 id="veccisc-improving-confidence-informed-self-consistency-with-reasoning-trace-clustering-and-candidate-answer-selection">VecCISC: Improving Confidence-Informed Self-Consistency with Reasoning Trace Clustering and Candidate Answer Selection</h3> <ul> <li>Date: 2026-05-08</li> <li>Source: <a href="https://arxiv.org/abs/2605.08070">arXiv:2605.08070</a></li> <li>Type: ACL 2026 Findings paper</li> </ul> <p>Confidence-weighted self-consistency can improve reasoning, but it is expensive when a critic model must score every sampled reasoning trace. VecCISC reduces that cost by clustering and filtering traces that are semantically equivalent, degenerate, or hallucinated before calling the critic.</p> <p>The paper reports a 47% token reduction while maintaining or exceeding CISC accuracy across math, chemistry, biology, commonsense, and humanities datasets. The main caveat is domain transfer: trace similarity and critic behavior can vary sharply by task.</p> <h3 id="scope-structured-decomposition-and-conditional-skill-orchestration-for-complex-image-generation">SCOPE: Structured Decomposition and Conditional Skill Orchestration for Complex Image Generation</h3> <ul> <li>Date: 2026-05-08</li> <li>Source: <a href="https://arxiv.org/abs/2605.08043">arXiv:2605.08043</a></li> <li>Type: preprint</li> </ul> <p>SCOPE attacks a familiar image-generation failure mode: complex prompts contain many visual commitments, and systems can lose track of them across grounding, generation, and verification. 
The method keeps those commitments in an evolving structured specification, then conditionally invokes retrieval, reasoning, and repair skills.</p> <p>The paper introduces Gen-Arena and reports stronger commitment-level intent realization than evaluated baselines, including 0.60 EGIP on Gen-Arena. The broader significance depends on whether the benchmark and metric gain independent use.</p> <h3 id="beyond-pairs-your-language-model-is-secretly-optimizing-a-preference-graph">Beyond Pairs: Your Language Model is Secretly Optimizing a Preference Graph</h3> <ul> <li>Date: 2026-05-08</li> <li>Source: <a href="https://arxiv.org/abs/2605.08037">arXiv:2605.08037</a></li> <li>Type: preprint</li> </ul> <p>GraphDPO argues that pairwise DPO throws away useful structure when each prompt has multiple ranked rollouts. The method represents ranked responses as a directed acyclic preference graph and optimizes a graph-structured objective while keeping linear per-prompt complexity.</p> <p>The authors report stronger results on reasoning and program-synthesis tasks than pairwise or listwise alternatives. As with most preference-optimization work, robustness will depend heavily on preference-data quality and replication across model families.</p> <h2 id="watch-list">Watch list</h2> <ul> <li>Inference efficiency is the dominant theme today: Gemma 4 drafters, Fast BLT, AutoTTS, and VecCISC all reduce latency, token cost, or search cost rather than only increasing model size.</li> <li>Agent workflows are converging on orchestration: enterprise products and research systems both emphasize delegated subtasks, verification, and repair loops.</li> <li>New evaluation surfaces such as Gen-Arena are worth watching if they become common baselines rather than one-off paper artifacts.</li> </ul>]]></content><author><name></name></author><category term="daily-sprouts"/><category term="AI"/><category term="papers"/><category term="AI-news"/><category term="daily-sprouts"/><summary type="html"><![CDATA[Daily AI research and news digest covering inference efficiency, agent workflows, byte-level LMs, and preference optimization.]]></summary></entry><entry xml:lang="en"><title type="html">AI Daily Sprouts | 2026-05-09</title><link href="https://chunzhuo.github.io/blog/2026/ai-daily-sprouts/" rel="alternate" type="text/html" title="AI Daily Sprouts | 2026-05-09"/><published>2026-05-09T00:00:00+00:00</published><updated>2026-05-09T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2026/ai-daily-sprouts</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2026/ai-daily-sprouts/"><![CDATA[<p>Search date: 2026-05-09. Window used: roughly the last 7-14 days, with one slightly older paper included because it directly relates to agent skill learning.</p> <h2 id="top-items">Top items</h2> <h3 id="openai-released-new-realtime-voice-models-for-the-api">OpenAI released new realtime voice models for the API</h3> <ul> <li>Date: 2026-05-07</li> <li>Source: <a href="https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/">OpenAI</a></li> <li>Type: product release</li> </ul> <p>OpenAI introduced GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper for live voice reasoning, translation, and streaming transcription. Voice agents are moving from turn-taking demos toward tool-using, multilingual, realtime workflows. 
The 128K context window for GPT-Realtime-2 also makes longer voice sessions more practical.</p> <p>Caveat: the performance claims are vendor-reported; production behavior still depends heavily on latency, tool design, and domain-specific evaluation.</p> <h3 id="openai-made-gpt-55-instant-the-default-chatgpt-model">OpenAI made GPT-5.5 Instant the default ChatGPT model</h3> <ul> <li>Date: 2026-05-05</li> <li>Source: <a href="https://openai.com/index/gpt-5-5-instant/">OpenAI</a></li> <li>Supporting source: <a href="https://openai.com/index/gpt-5-5-instant-system-card/">OpenAI system card</a></li> <li>Type: model release and safety publication</li> </ul> <p>GPT-5.5 Instant became ChatGPT’s default model, with OpenAI reporting fewer hallucinated claims than GPT-5.3 Instant, especially on high-stakes prompts. The main direction is reliability rather than only raw capability: lower hallucination rates, better image/STEM handling, improved search decisions, and more transparent personalization controls.</p> <p>Caveat: the hallucination reductions are from OpenAI’s internal evaluations; independent replication would be useful.</p> <h3 id="google-deepmind-highlighted-alphaevolves-broader-impact">Google DeepMind highlighted AlphaEvolve’s broader impact</h3> <ul> <li>Date: 2026-05-07</li> <li>Source: <a href="https://deepmind.google/blog/alphaevolve-impact/">Google DeepMind</a></li> <li>Type: research and deployment update</li> </ul> <p>DeepMind reported AlphaEvolve applications across genomics, grid optimization, quantum circuits, mathematics, TPU design, storage systems, logistics, ads, and materials/life-science modeling. This is a strong signal that LLM-powered algorithm discovery is becoming operational infrastructure, not just a research demo.</p> <p>Caveat: many claims are application-specific and come from Google or partner deployments; the generality of the approach depends on whether problems have reliable automated evaluators.</p> <h3 id="us-caisi-expanded-frontier-ai-model-testing-agreements">U.S. CAISI expanded frontier AI model testing agreements</h3> <ul> <li>Date: 2026-05-05</li> <li>Source: <a href="https://www.nist.gov/news-events/news/2026/05/caisi-signs-agreements-regarding-frontier-ai-national-security-testing">NIST / CAISI</a></li> <li>Supporting source: <a href="https://blogs.microsoft.com/on-the-issues/2026/05/05/advancing-ai-evaluation-with-the-center-for-ai-standards-us-and-innovation-and-the-ai-security-institute-uk/">Microsoft</a></li> <li>Type: policy / safety governance</li> </ul> <p>CAISI announced agreements with Google DeepMind, Microsoft, and xAI for pre-deployment evaluations and targeted research on frontier AI capabilities and security risks. Frontier model assessment is becoming more formalized, especially for cybersecurity, biosecurity, chemical-risk, and national-security concerns.</p> <p>Caveat: these are collaborative testing agreements, not a full public regulatory regime; details of model access, evaluation criteria, and enforcement remain limited.</p> <h3 id="anthropic-expanded-compute-capacity-and-claude-usage-limits">Anthropic expanded compute capacity and Claude usage limits</h3> <ul> <li>Date: 2026-05-06</li> <li>Source: <a href="https://www.anthropic.com/news/higher-limits-spacex">Anthropic</a></li> <li>Type: infrastructure / product capacity</li> </ul> <p>Anthropic announced a SpaceX compute partnership and higher Claude Code/API usage limits, including doubled five-hour Claude Code limits for several paid plans. 
Capacity is still a strategic bottleneck for frontier AI products. More compute directly affects developer workflows, API availability, and model deployment scale.</p> <h3 id="anthropic-announced-an-enterprise-ai-services-company">Anthropic announced an enterprise AI services company</h3> <ul> <li>Date: 2026-05-04</li> <li>Source: <a href="https://www.anthropic.com/news/enterprise-ai-services-company">Anthropic</a></li> <li>Type: enterprise AI deployment</li> </ul> <p>Anthropic, Blackstone, Hellman &amp; Friedman, and Goldman Sachs announced a new AI services company focused on helping mid-sized companies deploy Claude in core operations. Frontier labs are moving deeper into implementation services, not only model/API distribution.</p> <h2 id="recent-papers-and-benchmarks">Recent papers and benchmarks</h2> <h3 id="claw-eval-live-a-live-agent-benchmark-for-evolving-real-world-workflows">Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows</h3> <ul> <li>Date: 2026-05-01</li> <li>Source: <a href="https://chatpaper.com/paper/274070">ChatPaper summary</a></li> <li>Type: agent benchmark paper</li> </ul> <p>Static agent benchmarks age quickly and often grade final answers without verifying whether the agent actually executed a workflow. Claw-Eval-Live separates a refreshable signal layer from reproducible, timestamped release snapshots so agent tasks can evolve with real workflow demand.</p> <p>Caveat: I found a secondary paper page during this quick run; for a deeper digest, verify against the arXiv page or project repository.</p> <h3 id="skilllearnbench-benchmarking-continual-learning-methods-for-agent-skill-generation-on-real-world-tasks">SkillLearnBench: Benchmarking Continual Learning Methods for Agent Skill Generation on Real-World Tasks</h3> <ul> <li>Date: 2026-04-22</li> <li>Source: <a href="https://www.emergentmind.com/papers/2604.20087">Emergent Mind paper page</a></li> <li>Type: agent learning benchmark paper</li> </ul> <p>Skills are increasingly used to make agents reliable on complex tasks, but automatically generating and improving those skills is still uneven. 
This benchmark evaluates continual skill learning across 20 verified tasks and measures skill quality, execution trajectory, and task outcome.</p> <h2 id="watch-list">Watch list</h2> <ul> <li>Voice agents are becoming more tool-oriented and production-shaped.</li> <li>Frontier-model evaluation is shifting toward government-lab collaboration before deployment.</li> <li>Agent benchmarks are increasingly emphasizing live workflows, verification, and changing environments.</li> <li>Algorithm-discovery agents such as AlphaEvolve are moving from research examples into infrastructure and commercial optimization.</li> </ul>]]></content><author><name></name></author><category term="daily-sprouts"/><category term="AI"/><category term="papers"/><category term="AI-news"/><category term="daily-sprouts"/><summary type="html"><![CDATA[Daily AI research and news digest covering model releases, AI agents, frontier-model evaluation, and AI infrastructure.]]></summary></entry><entry xml:lang="en"><title type="html">bioAI Daily Sprouts | 2026-05-09</title><link href="https://chunzhuo.github.io/blog/2026/bioai-daily-sprouts/" rel="alternate" type="text/html" title="bioAI Daily Sprouts | 2026-05-09"/><published>2026-05-09T00:00:00+00:00</published><updated>2026-05-09T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2026/bioai-daily-sprouts</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2026/bioai-daily-sprouts/"><![CDATA[<p>Search date: 2026-05-09. Window: 2026-04-09 to 2026-05-09. Sources prioritized: Nature Biotechnology and Nature Methods publisher pages, with peer-reviewed articles and major reviews favored over news items.</p> <h2 id="papers">Papers</h2> <ol> <li> <p><strong>Digital twins of ex vivo human lungs enable accurate and personalized evaluation of therapeutic efficacy</strong> Nature Biotechnology, 2026-05-04. <a href="https://doi.org/10.1038/s41587-026-03121-4">DOI/link</a> Summary: Builds data-rich human lung digital twins from ex vivo lung perfusion, integrating physiology, imaging, transcriptomics, metabolomics and proteomics to forecast organ behavior and therapeutic response. Why it matters: It shows how organ-scale digital twins can be anchored in prospective human-organ measurements rather than purely retrospective clinical modeling. Tags: digital twins; translational biology; precision medicine; computational biology</p> </li> <li> <p><strong>TxPert: using multiple knowledge graphs for prediction of transcriptomic perturbation effects</strong> Nature Biotechnology, 2026-05-01. <a href="https://doi.org/10.1038/s41587-026-03113-4">DOI/link</a> Summary: Introduces a deep learning framework that combines basal transcriptomic state encoding with multiple biological knowledge graphs to predict out-of-distribution genetic perturbation responses. Why it matters: Perturbation prediction is central to model-guided experiments and drug discovery, and this paper explicitly benchmarks against strong nonlearned baselines and experimental reproducibility. Tags: AI4Bio; perturbation prediction; transcriptomics; knowledge graphs; machine learning</p> </li> <li> <p><strong>DNA-guided CRISPR-Cas12a effectors for programmable RNA recognition and cleavage</strong> Nature Biotechnology, 2026-05-01. <a href="https://doi.org/10.1038/s41587-026-03120-5">DOI/link</a> Summary: Reprograms Cas12a into a DNA-guided, RNA-targeting effector and demonstrates direct RNA detection plus intracellular RNA knockdown. 
Why it matters: The work expands programmable nucleic-acid engineering beyond canonical RNA-guided CRISPR architectures and creates new design space for RNA diagnostics and manipulation. Tags: CRISPR; RNA; synthetic biology; diagnostics; biotechnology</p> </li> <li> <p><strong>Single-molecule localization and diffusivity microscopy reveals dynamic biomolecular organization in living cells</strong> Nature Methods, 2026-04-28. <a href="https://doi.org/10.1038/s41592-026-03078-x">DOI/link</a> Summary: Presents SMLDM, a deep learning-enabled microscopy method that estimates molecule movement and diffusion from single-frame snapshots without trajectory linking. Why it matters: It sharply increases mapping density for live-cell molecular dynamics, helping connect spatial organization with mobility in chromatin, receptors, adhesions and condensates. Tags: bioimage informatics; deep learning; microscopy; single-molecule biophysics</p> </li> <li> <p><strong>Systematically decoding pathological morphologies and molecular profiles with unified multimodal embedding</strong> Nature Methods, 2026-04-24. <a href="https://doi.org/10.1038/s41592-026-03070-5">DOI/link</a> Summary: Introduces Multi-Embed, an interpretable multimodal framework for linking pathology morphology with multilayer molecular profiles. Why it matters: Computational pathology is moving from image-only predictors toward morphology-to-molecular reasoning that can support mechanistic disease interpretation. Tags: computational pathology; multimodal learning; molecular profiling; machine learning</p> </li> <li> <p><strong>Direct RNA sequencing and signal alignment reveal RNA structure ensembles in a eukaryotic cell</strong> Nature Methods, 2026-04-24. <a href="https://doi.org/10.1038/s41592-026-03069-y">DOI/link</a> Summary: Combines chemical probing, direct RNA sequencing and signal alignment to map RNA structural ensembles at single-molecule resolution in eukaryotic cells. Why it matters: It turns raw direct-sequencing signal into a richer readout of RNA structural heterogeneity, connecting transcript sequence, isoforms and regulatory structure. Tags: RNA structure; direct RNA sequencing; transcriptomics; computational biology</p> </li> <li> <p><strong>High-fidelity intravital imaging of biological dynamics with latent-space-enhanced digital adaptive optics</strong> Nature Biotechnology, 2026-04-23. <a href="https://doi.org/10.1038/s41587-026-03107-2">DOI/link</a> Summary: Develops latent-space-enhanced digital adaptive optics for intravital fluorescence microscopy, using wave-optics priors in spatial-angular data to improve aberration estimation. Why it matters: Better computational correction can make in vivo immune, neural and injury imaging more quantitative without relying only on expensive custom hardware. Tags: bioimage informatics; microscopy; latent representations; computational imaging</p> </li> <li> <p><strong>Orthrus: toward evolutionary and functional RNA foundation models</strong> Nature Methods, 2026-04-17. <a href="https://doi.org/10.1038/s41592-026-03064-3">DOI/link</a> Summary: Builds an RNA foundation-model direction aimed at learning evolutionary and functional representations across RNA sequences. Why it matters: RNA language models are becoming a parallel track to protein language models, with potential utility in RNA biology, functional prediction and therapeutic design. 
Tags: AI4Bio; RNA; foundation models; sequence modeling; transcriptomics</p> </li> <li> <p><strong>Artificial allosteric protein switches with machine-learning-designed receptors</strong> Nature Biotechnology, 2026-04-15. <a href="https://doi.org/10.1038/s41587-026-03081-9">DOI/link</a> Summary: Shows that machine-learning-designed ligand-binding domains can act as receptors in artificial allosteric protein switches and biosensors. Why it matters: It links generative protein design to working synthetic-biology devices, including logic gates, engineered cells and bioelectronic hormone sensing. Tags: protein design; synthetic biology; biosensors; AI4Bio</p> </li> <li> <p><strong>Inducible, split base editors for in vivo cancer functional genomics</strong> Nature Biotechnology, 2026-04-15. <a href="https://doi.org/10.1038/s41587-026-03077-5">DOI/link</a> Summary: Designs split, inducible base editors for controlled in vivo cancer functional genomics, reducing constraints from constitutively active deaminase systems. Why it matters: More controllable base-editing screens can improve mutation-level functional genomics in animal models and better separate target effects from editor toxicity. Tags: genome editing; base editors; cancer genomics; functional genomics</p> </li> <li> <p><strong>Adaptive optical correction for in vivo two-photon fluorescence microscopy with neural fields</strong> Nature Methods, 2026-04-13. <a href="https://doi.org/10.1038/s41592-026-03053-6">DOI/link</a> Summary: Uses neural fields to perform adaptive optical correction for in vivo two-photon microscopy under motion and sample-induced aberration. Why it matters: Neural representations are becoming useful infrastructure for biological imaging, especially when hardware-only correction is difficult or fragile. 
Tags: bioimage informatics; neural fields; microscopy; neuroscience; software</p> </li> </ol> <h2 id="watch-list">Watch list</h2> <ul> <li>Perturbation modeling is maturing: papers now spend more space on realistic out-of-distribution tasks, baselines and reproducibility ceilings.</li> <li>RNA-focused foundation models and direct RNA signal analysis are both advancing, suggesting stronger computational tools for RNA function and RNA therapeutics.</li> <li>Bioimage informatics is shifting toward latent representations, neural fields and deep-learning-assisted physical correction rather than segmentation alone.</li> <li>Experimentally grounded AI4Bio remains the strongest signal: the most useful papers combine model advances with organ systems, live-cell imaging, CRISPR tools or protein engineering validation.</li> </ul>]]></content><author><name></name></author><category term="daily-sprouts"/><category term="bioAI"/><category term="AI4Bio"/><category term="bioinformatics"/><category term="papers"/><category term="daily-sprouts"/><summary type="html"><![CDATA[Daily AI4Bio, bioinformatics, and computational biology paper digest.]]></summary></entry><entry xml:lang="en"><title type="html">Multimodality for Biology</title><link href="https://chunzhuo.github.io/blog/2026/multimodality-for-biology/" rel="alternate" type="text/html" title="Multimodality for Biology"/><published>2026-05-07T00:00:00+00:00</published><updated>2026-05-07T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2026/multimodality-for-biology</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2026/multimodality-for-biology/"><![CDATA[<p>In single-cell and broader computational biology, “multimodality” comes in many flavors — DNA sequence, RNA expression, chromatin accessibility, protein levels, perturbation responses, knowledge graphs, text. The hard part is rarely listing the modalities; it is choosing how to fuse them.</p> <p>These are notes from a recent talk where I tried to organize the landscape into three approaches: <strong>bottom-up</strong>, <strong>parallel</strong>, and <strong>uniform</strong>. Each makes a different bet about where biological structure lives and where modalities should meet inside the model.</p> <h2 id="multimodality-tasks">Multimodality tasks</h2> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image1-480.webp 480w,/assets/img/posts/multimodality-for-biology/image1-800.webp 800w,/assets/img/posts/multimodality-for-biology/image1-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image1.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>Before fixing on an architecture it helps to be explicit about the tasks we want a multimodal biological model to do — cross-modal prediction, perturbation response, cell-state inference, sequence-to-function, and so on. 
Different tasks pull architecture in different directions, and the rest of this post only makes sense relative to what we are asking the model to predict.</p> <h2 id="bottom-up-approach">Bottom-up approach</h2> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image2-480.webp 480w,/assets/img/posts/multimodality-for-biology/image2-800.webp 800w,/assets/img/posts/multimodality-for-biology/image2-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image2.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image3-480.webp 480w,/assets/img/posts/multimodality-for-biology/image3-800.webp 800w,/assets/img/posts/multimodality-for-biology/image3-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image3.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>The bottom-up approach builds representations along the natural hierarchy of biology: <strong>molecular → cellular → multicellular</strong>. UCE-style models learn cell embeddings from gene-level tokens; models like PULSAR push further toward tissue- and multicellular-level structure. Each tier is trained on what is plentiful at that scale, and the next tier inherits its substrate from below.</p> <p>The advantage is that each level is interpretable on its own terms and can be pretrained independently. The cost is that errors and biases compound as you climb the hierarchy.</p> <h3 id="from-sequence-to-perturbation">From sequence to perturbation</h3> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image4-480.webp 480w,/assets/img/posts/multimodality-for-biology/image4-800.webp 800w,/assets/img/posts/multimodality-for-biology/image4-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image4.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>A concrete instance of the bottom-up program: start from genomic sequence and train representations that transfer downstream to perturbation prediction. 
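The chain is <em>sequence → expression → response</em>, and the architectural question is at which level multimodal signals should enter.</p> <p>A minimal sketch of that chain, with hypothetical module names and shapes that stand in for real sequence models rather than reproduce any published architecture:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.nn as nn

class SeqEncoder(nn.Module):
    """Molecular tier: one-hot DNA windows to a fixed-size embedding."""
    def __init__(self, d=64):
        super().__init__()
        self.conv = nn.Conv1d(4, d, kernel_size=9, padding=4)
    def forward(self, x):  # x: (batch, 4, length)
        return self.conv(x).mean(dim=-1)  # (batch, d)

class ExpressionHead(nn.Module):
    """Cellular tier: sequence embedding to predicted expression."""
    def __init__(self, d=64, n_genes=100):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, n_genes))
    def forward(self, z):
        return self.mlp(z)

class ResponseHead(nn.Module):
    """Downstream tier: expression state to perturbation response."""
    def __init__(self, n_genes=100):
        super().__init__()
        self.out = nn.Linear(n_genes, n_genes)
    def forward(self, e):
        return self.out(e)

seq = torch.randn(8, 4, 200)  # stand-in for a batch of one-hot sequences
response = ResponseHead()(ExpressionHead()(SeqEncoder()(seq)))
print(response.shape)  # torch.Size([8, 100])
</code></pre></div></div> <p>The point of the sketch is only the factorization: each tier can be pretrained on what is plentiful at its own scale, which is also where the compounding of upstream errors enters.</p>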
<h2 id="parallel-approach">Parallel approach</h2> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image5-480.webp 480w,/assets/img/posts/multimodality-for-biology/image5-800.webp 800w,/assets/img/posts/multimodality-for-biology/image5-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image5.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>The parallel approach treats modalities as roughly co-equal and combines per-modality embeddings at the input. A canonical case: take a DNA sequence and seven epigenetic tracks, embed each independently, and <strong>directly sum the eight embeddings</strong>. Everything downstream sees a single fused vector.</p> <p>This is cheap, easy to scale modality-by-modality, and trivial to extend with a new track. The price is that direct summation assumes all modalities live in the same metric space — which is rarely true biologically.</p> <h3 id="separate-encoder-per-modality">Separate encoder per modality</h3> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image6-480.webp 480w,/assets/img/posts/multimodality-for-biology/image6-800.webp 800w,/assets/img/posts/multimodality-for-biology/image6-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image6.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>A more careful variant: keep one encoder per modality and fuse later. Each encoder can use whatever tokenization and inductive bias suits its data type, and fusion happens through concatenation, cross-attention, or gating — no longer at the input.</p> <h3 id="different-knowledge-sources">Different knowledge sources</h3> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image7-480.webp 480w,/assets/img/posts/multimodality-for-biology/image7-800.webp 800w,/assets/img/posts/multimodality-for-biology/image7-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image7.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>Beyond raw signals, multimodal can mean fusing different <em>kinds</em> of knowledge: an LLM for textual context, a knowledge graph for curated relations, tabular features for engineered priors.
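Two pooling strategies show up repeatedly:</p> <ul> <li><strong>Global pooling</strong> — a weighted average of source embeddings.</li> <li><strong>Attention-based pooling</strong> — let the query decide which source matters.</li> </ul> <p>The latter usually wins when the relevance of each source varies across examples.</p> <p>A minimal sketch of the two strategies, with invented shapes and random weights standing in for learned ones:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

sources = torch.randn(3, 16)  # one 16-dim embedding per source (text, graph, tabular)
query = torch.randn(16)       # representation of the entity asking the question

# Global pooling: a query-independent weighted average of the sources.
w = torch.softmax(torch.randn(3), dim=0)  # stand-in for learned, fixed weights
global_pooled = (w.unsqueeze(1) * sources).sum(dim=0)

# Attention-based pooling: the query decides which source matters this time.
attn = torch.softmax(sources @ query / 16 ** 0.5, dim=0)
attn_pooled = (attn.unsqueeze(1) * sources).sum(dim=0)

print(global_pooled.shape, attn_pooled.shape)  # torch.Size([16]) torch.Size([16])
</code></pre></div></div>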
<h2 id="uniform-approach">Uniform approach</h2> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image8-480.webp 480w,/assets/img/posts/multimodality-for-biology/image8-800.webp 800w,/assets/img/posts/multimodality-for-biology/image8-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image8.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>The uniform approach goes the other direction from per-modality encoders: serialize multiple sequences into a single stream and let one model digest them all. Sequence-related tasks (DNA, RNA, protein) are a natural fit — they already share a token-stream shape.</p> <p>The simplicity is appealing — one model, one loss, no fusion module. The hard part is teaching a single model to respect the very different statistics of, say, codon usage versus regulatory motifs.</p> <h2 id="relational-transformer-for-biology">Relational transformer for biology</h2> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image9-480.webp 480w,/assets/img/posts/multimodality-for-biology/image9-800.webp 800w,/assets/img/posts/multimodality-for-biology/image9-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image9.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>The architecture I am most interested in is a <strong>relational transformer</strong>: instead of forcing modalities through a single fusion bottleneck, represent biological entities (genes, cells, regions) as nodes and let attention range over typed relations between them.</p> <h3 id="details">Details</h3> <p>Two attention patterns carry most of the weight:</p> <ul> <li><strong>Relational attention</strong> — for <em>complementary</em> modalities, where each modality contributes information the others do not. Attention selects across modalities at each layer.</li> <li><strong>Hierarchical attention</strong> — for <em>hierarchical</em> modalities, where the structure itself is nested (region → gene → cell → tissue). Attention is constrained by that hierarchy.</li> </ul> <p>Two open problems I keep running into:</p> <ul> <li><strong>Memory constraint.</strong> Cross-modality attention is quadratic in token count, and biological inputs are long.</li> <li><strong>Coupling-data constraint.</strong> Training relational attention requires examples where modalities are observed together, and truly paired multimodal datasets at scale are still rare.</li> </ul> <p>These are the bottlenecks I think the next round of work — mine and others’ — needs to address.</p> <h2 id="references">References</h2> <ol> <li><span id="liang2024foundations">Liang, P.
P., Zadeh, A., &amp; Morency, L.-P. (2024). Foundations &amp; Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions. <i>ACM Computing Surveys</i>, <i>56</i>(10). https://doi.org/10.1145/3656580</span></li> <li><span id="rosen2023uce">Rosen, Y., Roohani, Y., Agrawal, A., Samotorcan, L., Quake, S. R., &amp; Leskovec, J. (2023). Universal Cell Embeddings: A Foundation Model for Cell Biology. <i>BioRxiv</i>. https://doi.org/10.1101/2023.11.28.568918</span></li> <li><span id="pang2025pulsar">Pang, K., Rosen, Y., Kedzierska, K., He, Z., Rajagopal, A., Gustafson, C. E., Huynh, G., &amp; Leskovec, J. (2025). PULSAR: a Foundation Model for Multi-scale and Multicellular Biology. <i>BioRxiv</i>. https://doi.org/10.1101/2025.11.24.685470</span></li> <li><span id="fu2026strand">Fu, B., Dasoulas, G., Gabbita, S., Lin, X., Gao, S., Su, X., Ghosh, S., &amp; Zitnik, M. (2026). STRAND: Sequence-Conditioned Transport for Single-Cell Perturbations. <i>ArXiv Preprint ArXiv:2602.10156</i>. https://arxiv.org/abs/2602.10156</span></li> <li><span id="yang2024multimodal">Yang, Z., Fan, X., Lan, M., Tang, X., Zheng, Z., Liu, B., You, Y., Tian, L., Church, G., Liu, X., &amp; Gu, F. (2024). Multimodal foundation model predicts zero-shot functional perturbations and cell fate dynamics. <i>BioRxiv</i>. https://doi.org/10.1101/2024.12.19.629561</span></li> <li><span id="yang2023genecompass">Yang, X., Liu, G., Feng, G., Bu, D., Wang, P., &amp; others. (2023). GeneCompass: Deciphering Universal Gene Regulatory Mechanisms with Knowledge-Informed Cross-Species Foundation Model. <i>BioRxiv</i>. https://doi.org/10.1101/2023.09.26.559542</span></li> <li><span id="littman2025presage">Littman, R., Levine, J., Maleki, S., Lee, Y., Ermakov, V., Qiu, L., Wu, A., Huang, K., Lopez, R., Scalia, G., Biancalani, T., Richmond, D., Regev, A., &amp; Hütter, J.-C. (2025). Gene-embedding-based prediction and functional evaluation of perturbation expression responses with PRESAGE. <i>BioRxiv</i>. https://doi.org/10.1101/2025.06.03.657653</span></li> <li><span id="golkar2026mimic">Golkar, S., Kovalic, J., Espejo Morales, I., Sledzieski, S., Cho, K., Cranmer, M., Ho, S., &amp; others. (2026). MIMIC: A Generative Multimodal Foundation Model for Biomolecules. <i>ArXiv Preprint ArXiv:2604.24506</i>. https://arxiv.org/abs/2604.24506</span></li> <li><span id="ranjan2025relational">Ranjan, R., Hudovernik, V., Znidar, M., Kanatsoulis, C., Upendra, R., Mohammadi, M., Meyer, J., Palczewski, T., Guestrin, C., &amp; Leskovec, J. (2025). Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data. <i>ArXiv Preprint ArXiv:2510.06377</i>. 
https://arxiv.org/abs/2510.06377</span></li> </ol>]]></content><author><name></name></author><category term="research-notes"/><category term="machine learning"/><category term="biology"/><category term="multimodality"/><category term="single-cell"/><category term="foundation-models"/><summary type="html"><![CDATA[Three approaches — bottom-up, parallel, and uniform — for fusing biological modalities, and where I think the field should go.]]></summary></entry><entry xml:lang="en"><title type="html">a post with plotly.js</title><link href="https://chunzhuo.github.io/blog/2025/plotly/" rel="alternate" type="text/html" title="a post with plotly.js"/><published>2025-03-26T14:24:00+00:00</published><updated>2025-03-26T14:24:00+00:00</updated><id>https://chunzhuo.github.io/blog/2025/plotly</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2025/plotly/"><![CDATA[<p>This is an example post with some <a href="https://plotly.com/javascript/">plotly</a> code.</p> <div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">```</span><span class="nl">plotly
</span><span class="sb">{
  "data": [
    {
      "x": [1, 2, 3, 4],
      "y": [10, 15, 13, 17],
      "type": "scatter"
    },
    {
      "x": [1, 2, 3, 4],
      "y": [16, 5, 11, 9],
      "type": "scatter"
    }
  ]
}</span>
<span class="p">```</span>
</code></pre></div></div> <p>Which generates:</p> <pre><code class="language-plotly">{
  "data": [
    {
      "x": [1, 2, 3, 4],
      "y": [10, 15, 13, 17],
      "type": "scatter"
    },
    {
      "x": [1, 2, 3, 4],
      "y": [16, 5, 11, 9],
      "type": "scatter"
    }
  ]
}
</code></pre> <p>Here is another example chart.</p> <div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">```</span><span class="nl">plotly
</span><span class="sb">{
  "data": [
    {
      "x": [1, 2, 3, 4],
      "y": [10, 15, 13, 17],
      "mode": "markers"
    },
    {
      "x": [2, 3, 4, 5],
      "y": [16, 5, 11, 9],
      "mode": "lines"
    },
    {
      "x": [1, 2, 3, 4],
      "y": [12, 9, 15, 12],
      "mode": "lines+markers"
    }
  ],
  "layout": {
    "title": {
      "text": "Line and Scatter Plot"
    }
  }
}</span>
<span class="p">```</span>
</code></pre></div></div> <p>This is how it looks:</p> <pre><code class="language-plotly">{
  "data": [
    {
      "x": [1, 2, 3, 4],
      "y": [10, 15, 13, 17],
      "mode": "markers"
    },
    {
      "x": [2, 3, 4, 5],
      "y": [16, 5, 11, 9],
      "mode": "lines"
    },
    {
      "x": [1, 2, 3, 4],
      "y": [12, 9, 15, 12],
      "mode": "lines+markers"
    }
  ],
  "layout": {
    "title": {
      "text": "Line and Scatter Plot"
    }
  }
}
</code></pre>]]></content><author><name></name></author><category term="sample-posts"/><category term="formatting"/><category term="charts"/><summary type="html"><![CDATA[this is what included plotly.js code could look like]]></summary></entry><entry xml:lang="en"><title type="html">a post with image galleries</title><link href="https://chunzhuo.github.io/blog/2024/photo-gallery/" rel="alternate" type="text/html" title="a post with image galleries"/><published>2024-12-04T01:59:00+00:00</published><updated>2024-12-04T01:59:00+00:00</updated><id>https://chunzhuo.github.io/blog/2024/photo-gallery</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2024/photo-gallery/"><![CDATA[<p>The images in this post are all zoomable, arranged into different mini-galleries using different libraries.</p> <h2 id="lightbox2"><a href="https://lokeshdhakar.com/projects/lightbox2/">Lightbox2</a></h2> <p><a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-2500.jpg" data-lightbox="roadtrip"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-200.jpg"/></a> <a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-2500.jpg" data-lightbox="roadtrip"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-200.jpg"/></a> <a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-2500.jpg" data-lightbox="roadtrip"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-200.jpg"/></a></p> <hr/> <h2 id="photoswipe"><a href="https://photoswipe.com/">PhotoSwipe</a></h2> <div class="pswp-gallery pswp-gallery--single-column" id="gallery--getting-started"> <a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-2500.jpg" data-pswp-width="1669" data-pswp-height="2500" target="_blank"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-200.jpg" alt=""/> </a> <a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/7/img-2500.jpg" data-pswp-width="1875" data-pswp-height="2500" data-cropped="true" target="_blank"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/7/img-200.jpg" alt=""/> </a> <a href="https://unsplash.com" data-pswp-src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-2500.jpg" data-pswp-width="2500" data-pswp-height="1666" target="_blank"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-200.jpg" alt=""/> </a> <div> <a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/6/img-2500.jpg" data-pswp-width="2500" data-pswp-height="1667" target="_blank"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/6/img-200.jpg" alt=""/> </a> </div> </div> <hr/> <h2 id="spotlight-js"><a href="https://nextapps-de.github.io/spotlight/">Spotlight JS</a></h2> <div class="spotlight-group"> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-2500.jpg"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-200.jpg"/> </a> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-2500.jpg"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-200.jpg"/> </a> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-2500.jpg"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-200.jpg"/> </a> </div> <div class="spotlight-group"> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/4/img-2500.jpg"> <img 
src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/4/img-200.jpg"/> </a> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/5/img-2500.jpg"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/5/img-200.jpg"/> </a> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/6/img-2500.jpg"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/6/img-200.jpg"/> </a> </div> <hr/> <h2 id="venobox"><a href="https://veno.es/venobox/">Venobox</a></h2> <p><a class="venobox" data-gall="myGallery" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-2500.jpg"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-200.jpg"/></a> <a class="venobox" data-gall="myGallery" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-2500.jpg"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-200.jpg"/></a> <a class="venobox" data-gall="myGallery" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-2500.jpg"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-200.jpg"/></a></p>]]></content><author><name></name></author><category term="sample-posts"/><category term="formatting"/><category term="images"/><summary type="html"><![CDATA[this is what included image galleries could look like]]></summary></entry><entry><title type="html">Google Gemini updates: Flash 1.5, Gemma 2 and Project Astra</title><link href="https://chunzhuo.github.io/blog/2024/google-gemini-updates-flash-15-gemma-2-and-project-astra/" rel="alternate" type="text/html" title="Google Gemini updates: Flash 1.5, Gemma 2 and Project Astra"/><published>2024-05-14T00:00:00+00:00</published><updated>2024-05-14T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2024/google-gemini-updates-flash-15-gemma-2-and-project-astra</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2024/google-gemini-updates-flash-15-gemma-2-and-project-astra/"><![CDATA[<p>Learn more:Learn more:Learn more:Learn more:Learn more:Learn more:May 14, 2024 We’re introducing a series of updates across the Gemini family of models, including the new 1.5 Flash, our lightweight model for speed and efficiency, and Project Astra, our vision for the future of AI assistants. In December, we launched our first natively multimodal model Gemini 1.0 in three sizes: Ultra, Pro and Nano. Just a few months later we released 1.5 Pro, with enhanced performance and a breakthrough long context window of 1 million tokens.Developers and enterprise customers have been putting 1.5 Pro to use in incredible ways and finding its long context window, multimodal reasoning capabilities and impressive overall performance incredibly useful.We know from user feedback that some applications need lower latency and a lower cost to serve. This inspired us to keep innovating, so today, we’re introducing Gemini 1.5 Flash: a model that’s lighter-weight than 1.5 Pro, and designed to be fast and efficient to serve at scale.Both 1.5 Pro and 1.5 Flash are available in public preview with a 1 million token context window in Google AI Studio and Vertex AI. 
And now, 1.5 Pro is also available with a 2 million token context window via waitlist to developers using the API and to Google Cloud customers.</p> <p>We’re also introducing updates across the Gemini family of models, announcing our next generation of open models, Gemma 2, and sharing progress on the future of AI assistants, with Project Astra.</p> <p><em>Figure: Context lengths of leading foundation models compared with Gemini 1.5’s 2 million token capability.</em></p> <p>1.5 Flash is the newest addition to the Gemini model family and the fastest Gemini model served in the API. It’s optimized for high-volume, high-frequency tasks at scale, is more cost-efficient to serve and features our breakthrough long context window.</p> <p>While it’s a lighter weight model than 1.5 Pro, it’s highly capable of multimodal reasoning across vast amounts of information and delivers impressive quality for its size.</p> <p>The new Gemini 1.5 Flash model is optimized for speed and efficiency, is highly capable of multimodal reasoning and features our breakthrough long context window.</p> <p>1.5 Flash excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more. This is because it’s been trained by 1.5 Pro through a process called “distillation,” where the most essential knowledge and skills from a larger model are transferred to a smaller, more efficient model.</p> <p>Read more about 1.5 Flash in our updated Gemini 1.5 technical report, on the Gemini technology page, and learn about 1.5 Flash’s availability and pricing.</p> <p>Over the last few months, we’ve significantly improved 1.5 Pro, our best model for general performance across a wide range of tasks.</p> <p>Beyond extending its context window to 2 million tokens, we’ve enhanced its code generation, logical reasoning and planning, multi-turn conversation, and audio and image understanding through data and algorithmic advances. We see strong improvements on public and internal benchmarks for each of these tasks.</p> <p>1.5 Pro can now follow increasingly complex and nuanced instructions, including ones that specify product-level behavior involving role, format and style. We’ve improved control over the model’s responses for specific use cases, like crafting the persona and response style of a chat agent or automating workflows through multiple function calls. And we’ve enabled users to steer model behavior by setting system instructions.</p> <p>We added audio understanding in the Gemini API and Google AI Studio, so 1.5 Pro can now reason across image and audio for videos uploaded in Google AI Studio. And we’re now integrating 1.5 Pro into Google products, including Gemini Advanced and in Workspace apps.</p> <p>Read more about 1.5 Pro in our updated Gemini 1.5 technical report and on the Gemini technology page.</p> <p>Gemini Nano is expanding beyond text-only inputs to include images as well. Starting with Pixel, applications using Gemini Nano with Multimodality will be able to understand the world the way people do — not just through text, but also through sight, sound and spoken language.</p> <p>Read more about Gemini 1.0 Nano on Android.</p> <p>Today, we’re also sharing a series of updates to Gemma, our family of open models built from the same research and technology used to create the Gemini models.</p> <p>We’re announcing Gemma 2, our next generation of open models for responsible AI innovation. Gemma 2 has a new architecture designed for breakthrough performance and efficiency, and will be available in new sizes.</p> <p>The Gemma family is also expanding with PaliGemma, our first vision-language model inspired by PaLI-3.
And we’ve upgraded our Responsible Generative AI Toolkit with LLM Comparator for evaluating the quality of model responses.</p> <p>Read more on the Developer blog.</p> <p>As part of Google DeepMind’s mission to build AI responsibly to benefit humanity, we’ve always wanted to develop universal AI agents that can be helpful in everyday life. That’s why today, we’re sharing our progress in building the future of AI assistants with Project Astra (advanced seeing and talking responsive agent).</p> <p>To be truly useful, an agent needs to understand and respond to the complex and dynamic world just like people do — and take in and remember what it sees and hears to understand context and take action. It also needs to be proactive, teachable and personal, so users can talk to it naturally and without lag or delay.</p> <p>While we’ve made incredible progress developing AI systems that can understand multimodal information, getting response time down to something conversational is a difficult engineering challenge. Over the past few years, we’ve been working to improve how our models perceive, reason and converse to make the pace and quality of interaction feel more natural.</p> <p>Building on Gemini, we’ve developed prototype agents that can process information faster by continuously encoding video frames, combining the video and speech input into a timeline of events, and caching this information for efficient recall.</p> <p>By leveraging our leading speech models, we also enhanced how they sound, giving the agents a wider range of intonations. These agents can better understand the context they’re being used in, and respond quickly, in conversation.</p> <p>With technology like this, it’s easy to envision a future where people could have an expert AI assistant by their side, through a phone or glasses. And some of these capabilities are coming to Google products, like the Gemini app and web experience, later this year.</p> <p>We’ve made incredible progress so far with our family of Gemini models, and we’re always striving to advance the state-of-the-art even further. By investing in a relentless production line of innovation, we’re able to explore new ideas at the frontier, while also unlocking the possibility of new and exciting Gemini use cases.</p> <p>Learn more about Gemini and its capabilities.</p>
]]></content><author><name></name></author><category term="external-posts"/><category term="google"/><summary type="html"><![CDATA[We’re sharing updates across our Gemini family of models and a glimpse of Project Astra, our vision for the future of AI assistants.]]></summary></entry><entry xml:lang="en"><title type="html">a post with tabs</title><link href="https://chunzhuo.github.io/blog/2024/tabs/" rel="alternate" type="text/html" title="a post with tabs"/><published>2024-05-01T00:32:13+00:00</published><updated>2024-05-01T00:32:13+00:00</updated><id>https://chunzhuo.github.io/blog/2024/tabs</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2024/tabs/"><![CDATA[<p>This is what a post with <a href="https://github.com/Ovski4/jekyll-tabs">tabs</a> looks like. Note that tabs can be used for different purposes, not only for code.</p> <h2 id="first-tabs">First tabs</h2> <p>To add tabs, use the following syntax:</p> <div class="language-liquid highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">{%</span><span class="w"> </span><span class="nt">tabs</span><span class="w"> </span><span class="nv">group-name</span><span class="w"> </span><span class="cp">%}</span>

<span class="cp">{%</span><span class="w"> </span><span class="nt">tab</span><span class="w"> </span><span class="nv">group-name</span><span class="w"> </span><span class="nv">tab-name-1</span><span class="w"> </span><span class="cp">%}</span>

Content 1

<span class="cp">{%</span><span class="w"> </span><span class="nt">endtab</span><span class="w"> </span><span class="cp">%}</span>

<span class="cp">{%</span><span class="w"> </span><span class="nt">tab</span><span class="w"> </span><span class="nv">group-name</span><span class="w"> </span><span class="nv">tab-name-2</span><span class="w"> </span><span class="cp">%}</span>

Content 2

<span class="cp">{%</span><span class="w"> </span><span class="nt">endtab</span><span class="w"> </span><span class="cp">%}</span>

<span class="cp">{%</span><span class="w"> </span><span class="nt">endtabs</span><span class="w"> </span><span class="cp">%}</span>
</code></pre></div></div> <p>With this you can generate tabbed content like:</p> <ul id="log" class="tab" data-tab="6c3bca0a-f9f8-461c-a67c-19a51ba7f39c" data-name="log"> <li class="active" id="log-php"> <a href="#">php </a> </li> <li id="log-js"> <a href="#">js </a> </li> <li id="log-ruby"> <a href="#">ruby </a> </li> </ul> <ul class="tab-content" id="6c3bca0a-f9f8-461c-a67c-19a51ba7f39c" data-name="log"> <li class="active"> <div class="language-php highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">var_dump</span><span class="p">(</span><span class="s1">'hello'</span><span class="p">);</span>
</code></pre></div></div> </li> <li> <div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">hello</span><span class="dl">"</span><span class="p">);</span>
</code></pre></div></div> </li> <li> <div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">pputs</span> <span class="dl">'</span><span class="s1">hello</span><span class="dl">'</span>
</code></pre></div></div> </li> </ul> <h2 id="another-example">Another example</h2> <ul id="data-struct" class="tab" data-tab="c350d409-567c-41ea-8b3d-8f3776217c82" data-name="data-struct"> <li class="active" id="data-struct-yaml"> <a href="#">yaml </a> </li> <li id="data-struct-json"> <a href="#">json </a> </li> </ul> <ul class="tab-content" id="c350d409-567c-41ea-8b3d-8f3776217c82" data-name="data-struct"> <li class="active"> <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">hello</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s2">"</span><span class="s">whatsup"</span>
  <span class="pi">-</span> <span class="s2">"</span><span class="s">hi"</span>
</code></pre></div></div> </li> <li> <div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"hello"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"whatsup"</span><span class="p">,</span><span class="w"> </span><span class="s2">"hi"</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div> </li> </ul> <h2 id="tabs-for-something-else">Tabs for something else</h2> <ul id="something-else" class="tab" data-tab="82044b73-cd88-413a-a8eb-a23e63dd35b8" data-name="something-else"> <li class="active" id="something-else-text"> <a href="#">text </a> </li> <li id="something-else-quote"> <a href="#">quote </a> </li> <li id="something-else-list"> <a href="#">list </a> </li> </ul> <ul class="tab-content" id="82044b73-cd88-413a-a8eb-a23e63dd35b8" data-name="something-else"> <li class="active"> <p>Regular text</p> </li> <li> <blockquote> <p>A quote</p> </blockquote> </li> <li> <p>Hipster list</p> <ul> <li>brunch</li> <li>fixie</li> <li>raybans</li> <li>messenger bag</li> </ul> </li> </ul>]]></content><author><name></name></author><category term="sample-posts"/><category term="formatting"/><category term="code"/><summary type="html"><![CDATA[this is what included tabs in a post could look like]]></summary></entry><entry xml:lang="en"><title type="html">a post with typograms</title><link href="https://chunzhuo.github.io/blog/2024/typograms/" rel="alternate" type="text/html" title="a post with typograms"/><published>2024-04-29T23:36:10+00:00</published><updated>2024-04-29T23:36:10+00:00</updated><id>https://chunzhuo.github.io/blog/2024/typograms</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2024/typograms/"><![CDATA[<p>This is an example post with some <a href="https://github.com/google/typograms/">typograms</a> code.</p> <div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">```</span><span class="nl">typograms
</span><span class="sb">+----+
|    |---&gt; My first diagram!
+----+</span>
<span class="p">```</span>
</code></pre></div></div> <p>Which generates:</p> <pre><code class="language-typograms">+----+
|    |---&gt; My first diagram!
+----+
</code></pre> <p>Another example:</p> <div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">```</span><span class="nl">typograms
</span><span class="sb">.------------------------.
|.----------------------.|
||"https://example.com" ||
|'----------------------'|
| ______________________ |
||                      ||
||   Welcome!           ||
||                      ||
||                      ||
||  .----------------.  ||
||  | username       |  ||
||  '----------------'  ||
||  .----------------.  ||
||  |"*******"       |  ||
||  '----------------'  ||
||                      ||
||  .----------------.  ||
||  |   "Sign-up"    |  ||
||  '----------------'  ||
||                      ||
|+----------------------+|
.------------------------.</span>
<span class="p">```</span>
</code></pre></div></div> <p>which generates:</p> <pre><code class="language-typograms">.------------------------.
|.----------------------.|
||"https://example.com" ||
|'----------------------'|
| ______________________ |
||                      ||
||   Welcome!           ||
||                      ||
||                      ||
||  .----------------.  ||
||  | username       |  ||
||  '----------------'  ||
||  .----------------.  ||
||  |"*******"       |  ||
||  '----------------'  ||
||                      ||
||  .----------------.  ||
||  |   "Sign-up"    |  ||
||  '----------------'  ||
||                      ||
|+----------------------+|
.------------------------.
</code></pre> <p>For more examples, check out the <a href="https://google.github.io/typograms/#examples">typograms documentation</a>.</p>]]></content><author><name></name></author><category term="sample-posts"/><category term="formatting"/><category term="diagrams"/><summary type="html"><![CDATA[this is what included typograms code could look like]]></summary></entry></feed>