<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://chunzhuo.github.io/feed.xml" rel="self" type="application/atom+xml"/><link href="https://chunzhuo.github.io/" rel="alternate" type="text/html" hreflang="en"/><updated>2026-05-12T15:35:17+00:00</updated><id>https://chunzhuo.github.io/feed.xml</id><title type="html">Chunzhuo Zhang</title><subtitle>Researcher working at the intersection of AI4Bio, bioinformatics, and machine learning. </subtitle><entry xml:lang="en"><title type="html">Single-cell Perturb-seq CRISPRi</title><link href="https://chunzhuo.github.io/blog/2026/perturb-seq-crispri/" rel="alternate" type="text/html" title="Single-cell Perturb-seq CRISPRi"/><published>2026-05-11T00:00:00+00:00</published><updated>2026-05-11T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2026/perturb-seq-crispri</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2026/perturb-seq-crispri/"><![CDATA[<p>CRISPRi is a useful perturbation because it behaves like a dimmer switch: the guide RNA brings a catalytically inactive Cas9 repressor to a regulatory region, and transcription drops without making a DNA double-strand break. Perturb-seq adds a pooled single-cell readout, so each cell carries both a perturbation identity and a transcriptome.</p> <p>The interactive view below is one continuous 3D cell, not a sequence of separate plots. Drag the cell to rotate it, scroll to zoom, or use the focus buttons to move from the whole cell into the chromatin zone, the open sgRNA target sequence, and the transcript readout.</p> <div class="perturbseq-crispri-viewer" data-zoom="whole"> <div class="perturbseq-toolbar" aria-label="Perturb-seq CRISPRi controls"> <div class="perturbseq-focus-buttons"> <button type="button" data-focus-target="cell">Whole cell</button> <button type="button" data-focus-target="nucleus">Chromatin</button> <button type="button" data-focus-target="binding">Open target</button> <button type="button" data-focus-target="readout">Readout</button> </div> <div class="perturbseq-zoom-controls"> <button type="button" data-zoom-action="out">-</button> <button type="button" data-zoom-action="in">+</button> <button type="button" data-zoom-action="reset">Reset view</button> <span class="perturbseq-zoom-status">100%</span> </div> </div> <div class="perturbseq-canvas-wrap"> <div class="perturbseq-three-stage" role="img" aria-label="Rotatable 3D cell view of Perturb-seq CRISPRi with organelles and sgRNA target binding"> <canvas class="perturbseq-three-canvas"></canvas> <div class="perturbseq-three-hint">Drag to rotate · Scroll to zoom</div> </div> <svg class="perturbseq-svg" viewBox="0 0 1000 700" role="img" aria-label="Zoomable whole-cell view of Perturb-seq CRISPRi with organelles and sgRNA target binding"> <defs> <marker id="perturbseq-arrow" viewBox="0 0 10 10" refX="9" refY="5" markerWidth="7" markerHeight="7" orient="auto-start-reverse"> <path d="M 0 0 L 10 5 L 0 10 z" fill="currentColor"></path> </marker> </defs> <g class="perturbseq-viewport"> <text class="perturbseq-title" x="34" y="44">Single-cell Perturb-seq CRISPRi</text> <text class="perturbseq-small" x="34" y="66">A single perturbed cell with zoomable organelles, CRISPRi target binding, transcripts, and guide identity.</text> <path class="perturbseq-cell-body" d="M135 348 C132 212 232 120 378 96 C560 65 775 126 857 266 C935 399 867 556 707 618 C554 678 329 636 219 531 C162 476 136 420 
135 348 Z"></path> <path class="perturbseq-cytoplasm-texture" d="M222 236 C312 198 380 194 468 218 M678 182 C756 226 796 290 796 360 M232 520 C325 564 464 580 592 558 M716 508 C774 472 814 418 816 348"></path> <g aria-label="nucleus"> <ellipse class="perturbseq-nucleus perturbseq-pulse" cx="455" cy="315" rx="205" ry="150"></ellipse> <ellipse class="perturbseq-nucleolus" cx="374" cy="360" rx="48" ry="35"></ellipse> <path class="perturbseq-dna" d="M306 292 C356 238 430 334 482 283 S596 256 625 323"></path> <path class="perturbseq-dna" d="M302 335 C366 386 421 283 488 346 S590 394 636 338" opacity="0.72"></path> <rect class="perturbseq-target-window perturbseq-pulse" x="430" y="244" width="142" height="72" rx="10"></rect> <text class="perturbseq-label" x="356" y="187">nucleus</text> <text class="perturbseq-small perturbseq-detail-medium" x="315" y="405">nucleolus</text> </g> <g aria-label="CRISPRi binding site"> <path class="perturbseq-dna perturbseq-detail-high" d="M442 278 C468 259 493 302 520 279 S552 268 567 287"></path> <circle class="perturbseq-cas9 perturbseq-pulse" cx="492" cy="282" r="18"></circle> <rect class="perturbseq-krab perturbseq-pulse" x="504" y="252" width="42" height="24" rx="8"></rect> <path class="perturbseq-sgrna perturbseq-pulse" d="M461 294 C474 319 506 320 519 297"></path> <path class="perturbseq-sgrna perturbseq-detail-high" d="M467 309 q8 12 16 0 q8 -12 16 0 q8 12 16 0"></path> <rect class="perturbseq-rnap" x="558" y="305" width="62" height="30" rx="15"></rect> <path class="perturbseq-transcript perturbseq-target-transcript" d="M620 321 C648 322 674 332 698 352"></path> <path class="perturbseq-transcript perturbseq-detail-high" d="M621 321 q10 -10 20 0 q10 10 20 0 q10 -10 20 0"></path> <text class="perturbseq-label perturbseq-detail-medium" x="518" y="238">dCas9-KRAB</text> <text class="perturbseq-small perturbseq-detail-high" x="425" y="333">sgRNA pairs with target sequence</text> <text class="perturbseq-small perturbseq-detail-high" x="586" y="354">reduced nascent transcript</text> </g> <g aria-label="endoplasmic reticulum"> <path class="perturbseq-er" d="M614 262 C705 246 766 278 775 346 C784 415 724 450 646 430"></path> <path class="perturbseq-er" d="M622 298 C695 291 734 316 736 358 C738 400 699 413 648 398" opacity="0.7"></path> <text class="perturbseq-small perturbseq-detail-medium" x="720" y="248">ER</text> </g> <g aria-label="mitochondria"> <ellipse class="perturbseq-mito" cx="255" cy="270" rx="62" ry="28" transform="rotate(-22 255 270)"></ellipse> <path class="perturbseq-detail-medium" d="M214 278 C238 248 258 294 296 260" fill="none" stroke="var(--ps-red)" stroke-linecap="round" stroke-width="2"></path> <ellipse class="perturbseq-mito" cx="730" cy="486" rx="68" ry="30" transform="rotate(18 730 486)"></ellipse> <path class="perturbseq-detail-medium" d="M684 478 C713 513 737 456 779 496" fill="none" stroke="var(--ps-red)" stroke-linecap="round" stroke-width="2"></path> <text class="perturbseq-small perturbseq-detail-medium" x="198" y="224">mitochondrion</text> </g> <g aria-label="golgi and vesicles"> <path class="perturbseq-organelle" d="M286 472 C340 438 392 448 431 488 C380 480 336 488 286 472 Z"></path> <path class="perturbseq-organelle" d="M296 502 C344 482 389 489 422 518 C372 516 332 519 296 502 Z" opacity="0.75"></path> <circle class="perturbseq-organelle" cx="452" cy="517" r="13"></circle> <circle class="perturbseq-organelle" cx="478" cy="496" r="9"></circle> <text class="perturbseq-small perturbseq-detail-medium" x="320" y="548">Golgi / 
vesicles</text> </g> <g aria-label="transcripts and guide molecules"> <path class="perturbseq-transcript" d="M610 464 C655 445 680 471 715 455"></path> <path class="perturbseq-transcript" d="M535 518 C578 500 622 535 660 509"></path> <path class="perturbseq-transcript perturbseq-target-transcript" d="M608 390 C636 405 660 396 682 418"></path> <circle class="perturbseq-guide-dot" cx="592" cy="494" r="9"></circle> <circle class="perturbseq-guide-dot" cx="629" cy="536" r="7" style="animation-delay: -1.2s;"></circle> <text class="perturbseq-small perturbseq-detail-medium" x="612" y="576">guide identity + transcriptome stay linked to this cell</text> </g> <g aria-label="single-cell capture barcode"> <rect class="perturbseq-callout" x="715" y="84" width="220" height="88" rx="8"></rect> <text class="perturbseq-label" x="734" y="114">Perturb-seq readout</text> <text class="perturbseq-small" x="734" y="138">cell barcode + UMI</text> <text class="perturbseq-small" x="734" y="156">mRNA reads + guide tag</text> <path d="M720 174 C690 235 684 326 680 416" fill="none" stroke="var(--ps-muted)" stroke-dasharray="7 7" stroke-linecap="round" stroke-width="2" marker-end="url(#perturbseq-arrow)"></path> </g> <g class="perturbseq-detail-high" aria-label="zoom labels"> <rect class="perturbseq-callout" x="334" y="116" width="230" height="56" rx="8"></rect> <text class="perturbseq-small" x="350" y="140">Zoom depth reveals the molecular site:</text> <text class="perturbseq-small" x="350" y="158">sgRNA-dCas9-KRAB at target DNA/TSS</text> </g> </g> </svg> </div> <div class="perturbseq-info"> <div class="perturbseq-focus-label">Whole cell</div> <p class="perturbseq-focus-text">One perturbed cell remains in view: membrane, nucleus, organelles, sgRNA cargo, mRNA molecules, and capture barcode are all part of the same scene.</p> </div> </div> <script src="/assets/js/perturbseq-crispri.js?v=ea3af894f2a1b75ef3dad6214393cd07"></script> <h2 id="what-the-experiment-measures">What the experiment measures</h2> <p>The key output is not only whether a target gene went down. The useful object is a table where every row is a single cell, every cell has a guide assignment, and every column is a measured gene. That lets us ask whether perturbing one regulator shifts cells toward another state, suppresses a pathway, changes response to stimulation, or creates a subtle expression program that would be invisible in a bulk assay.</p> <h2 id="why-crispri-fits-this-readout">Why CRISPRi fits this readout</h2> <p>CRISPRi is especially useful when complete knockout is too harsh or when multiple perturbations would create too many DNA breaks. 
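Because it represses transcription through dCas9-KRAB rather than cutting DNA, it can be paired with pooled single-cell screens where the phenotype is a transcriptome, not just growth.</p> <p>To make the cells-by-genes table described above concrete, here is a minimal, hypothetical sketch in plain numpy and pandas. The gene and guide names are invented, and a real analysis would add normalization, guide-assignment quality control, and proper statistics:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
genes = ["GENE_A", "GENE_B", "GENE_C"]

# One row per cell, one column per measured gene.
X = pd.DataFrame(rng.poisson(5, size=(300, len(genes))), columns=genes)

# One guide assignment per cell: the perturbation identity.
guide = pd.Series(rng.choice(["sgGENE_A", "non-targeting"], size=300), name="guide")

# Compare a perturbation against non-targeting controls (log2 fold change).
ctrl_mean = X[guide == "non-targeting"].mean()
lfc = np.log2((X[guide == "sgGENE_A"].mean() + 1.0) / (ctrl_mean + 1.0))
print(lfc.round(2).to_dict())
</code></pre></div></div>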
<h2 id="minimal-protocol-logic">Minimal protocol logic</h2> <ol> <li>Build or obtain a cell line expressing CRISPRi machinery.</li> <li>Introduce a pooled sgRNA library at controlled multiplicity.</li> <li>Select and culture cells long enough for repression.</li> <li>Capture single cells and prepare transcriptome plus guide libraries.</li> <li>Sequence, assign guides to cells, and quantify expression.</li> <li>Compare each perturbation against controls and visualize response programs.</li> </ol>]]></content><author><name></name></author><category term="research-notes"/><category term="biology"/><category term="single-cell"/><category term="CRISPRi"/><category term="Perturb-seq"/><category term="functional-genomics"/><summary type="html"><![CDATA[An interactive visual explanation of how CRISPRi perturbations are linked to single-cell transcriptomes.]]></summary></entry><entry xml:lang="en"><title type="html">AI Daily Sprouts | 2026-05-10</title><link href="https://chunzhuo.github.io/blog/2026/ai-daily-sprouts-2026-05-10/" rel="alternate" type="text/html" title="AI Daily Sprouts | 2026-05-10"/><published>2026-05-10T00:00:00+00:00</published><updated>2026-05-10T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2026/ai-daily-sprouts-2026-05-10</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2026/ai-daily-sprouts-2026-05-10/"><![CDATA[<p>Search date: 2026-05-10. Window used: roughly the last 7 days.</p> <h2 id="top-items">Top items</h2> <h3 id="google-released-gemma-4-multi-token-prediction-drafters">Google released Gemma 4 multi-token prediction drafters</h3> <ul> <li>Date: 2026-05-05</li> <li>Source: <a href="https://blog.google/innovation-and-ai/technology/developers-tools/multi-token-prediction-gemma-4/">Google</a></li> <li>Type: open-model inference release</li> </ul> <p>Google released Multi-Token Prediction drafters for Gemma 4. The drafters use speculative decoding: a smaller draft component predicts several future tokens, then the main model verifies them in parallel. Google reports up to a 3x speedup without degrading output quality or reasoning logic.</p> <p>Why it matters: this targets a practical bottleneck for local, edge, and workstation LLM deployment. The bottleneck is often token-by-token latency rather than only raw model capability.</p> <p>Caveat: the speed and quality claims are vendor-reported and hardware-dependent; independent deployment measurements will matter.</p> <h3 id="microsoft-framed-frontier-firms-around-human-agent-operating-models">Microsoft framed “Frontier Firms” around human-agent operating models</h3> <ul> <li>Date: 2026-05-05</li> <li>Source: <a href="https://blogs.microsoft.com/blog/2026/05/05/how-frontier-firms-are-rebuilding-the-operating-model-for-the-age-of-ai/">Microsoft</a></li> <li>Type: enterprise AI / agent workflow update</li> </ul> <p>Microsoft described a progression from authoring with AI to editing, directing, and orchestrating AI agents, and tied that model to expanded Copilot Cowork capabilities.
The operating-model framing is useful because it moves the conversation from “does an assistant help?” to “how do governed agents run work across systems?”</p> <p>Caveat: this is an official product and strategy narrative, not an independent productivity study.</p> <h2 id="recent-papers">Recent papers</h2> <h3 id="llms-improving-llms-agentic-discovery-for-test-time-scaling">LLMs Improving LLMs: Agentic Discovery for Test-Time Scaling</h3> <ul> <li>Date: 2026-05-08</li> <li>Source: <a href="https://arxiv.org/abs/2605.08083">arXiv:2605.08083</a></li> <li>Type: preprint</li> </ul> <p>AutoTTS reframes test-time scaling as a controller-synthesis problem. Instead of manually choosing when a model should branch, continue, probe, prune, or stop, the method searches over inference policies using pre-collected reasoning trajectories and probe signals.</p> <p>The authors report better accuracy-cost tradeoffs on math reasoning benchmarks, generalization to held-out benchmarks and model scales, and a discovery cost of about $39.90 and 160 minutes. The practical caveat is that this is still a new preprint; the promised code release should be checked before treating it as deployable infrastructure.</p> <h3 id="fast-byte-latent-transformer">Fast Byte Latent Transformer</h3> <ul> <li>Date: 2026-05-08</li> <li>Source: <a href="https://arxiv.org/abs/2605.08044">arXiv:2605.08044</a></li> <li>Type: preprint</li> </ul> <p>Byte-level language models avoid fixed subword vocabularies, but byte-by-byte decoding is slow. This paper introduces BLT Diffusion, BLT Self-speculation, and BLT Diffusion+Verification so byte-level models can generate multiple bytes per step or verify drafted bytes efficiently.</p> <p>The authors report that the approaches can reduce estimated memory-bandwidth cost by more than 50% on generation tasks. The next test is whether these methods hold up in real serving stacks and downstream applications.</p> <h3 id="veccisc-improving-confidence-informed-self-consistency-with-reasoning-trace-clustering-and-candidate-answer-selection">VecCISC: Improving Confidence-Informed Self-Consistency with Reasoning Trace Clustering and Candidate Answer Selection</h3> <ul> <li>Date: 2026-05-08</li> <li>Source: <a href="https://arxiv.org/abs/2605.08070">arXiv:2605.08070</a></li> <li>Type: ACL 2026 Findings paper</li> </ul> <p>Confidence-weighted self-consistency can improve reasoning, but it is expensive when a critic model must score every sampled reasoning trace. VecCISC reduces that cost by clustering and filtering traces that are semantically equivalent, degenerate, or hallucinated before calling the critic.</p> <p>The paper reports a 47% token reduction while maintaining or exceeding CISC accuracy across math, chemistry, biology, commonsense, and humanities datasets. The main caveat is domain transfer: trace similarity and critic behavior can vary sharply by task.</p> <h3 id="scope-structured-decomposition-and-conditional-skill-orchestration-for-complex-image-generation">SCOPE: Structured Decomposition and Conditional Skill Orchestration for Complex Image Generation</h3> <ul> <li>Date: 2026-05-08</li> <li>Source: <a href="https://arxiv.org/abs/2605.08043">arXiv:2605.08043</a></li> <li>Type: preprint</li> </ul> <p>SCOPE attacks a familiar image-generation failure mode: complex prompts contain many visual commitments, and systems can lose track of them across grounding, generation, and verification. 
The method keeps those commitments in an evolving structured specification, then conditionally invokes retrieval, reasoning, and repair skills.</p> <p>The paper introduces Gen-Arena and reports stronger commitment-level intent realization than evaluated baselines, including 0.60 EGIP on Gen-Arena. The broader significance depends on whether the benchmark and metric gain independent use.</p> <h3 id="beyond-pairs-your-language-model-is-secretly-optimizing-a-preference-graph">Beyond Pairs: Your Language Model is Secretly Optimizing a Preference Graph</h3> <ul> <li>Date: 2026-05-08</li> <li>Source: <a href="https://arxiv.org/abs/2605.08037">arXiv:2605.08037</a></li> <li>Type: preprint</li> </ul> <p>GraphDPO argues that pairwise DPO throws away useful structure when each prompt has multiple ranked rollouts. The method represents ranked responses as a directed acyclic preference graph and optimizes a graph-structured objective while keeping linear per-prompt complexity.</p> <p>The authors report stronger results on reasoning and program-synthesis tasks than pairwise or listwise alternatives. As with most preference-optimization work, robustness will depend heavily on preference-data quality and replication across model families.</p> <h2 id="watch-list">Watch list</h2> <ul> <li>Inference efficiency is the dominant theme today: Gemma 4 drafters, Fast BLT, AutoTTS, and VecCISC all reduce latency, token cost, or search cost rather than only increasing model size.</li> <li>Agent workflows are converging on orchestration: enterprise products and research systems both emphasize delegated subtasks, verification, and repair loops.</li> <li>New evaluation surfaces such as Gen-Arena are worth watching if they become common baselines rather than one-off paper artifacts.</li> </ul>]]></content><author><name></name></author><category term="daily-sprouts"/><category term="AI"/><category term="papers"/><category term="AI-news"/><category term="daily-sprouts"/><summary type="html"><![CDATA[Daily AI research and news digest covering inference efficiency, agent workflows, byte-level LMs, and preference optimization.]]></summary></entry><entry xml:lang="en"><title type="html">AI Daily Sprouts | 2026-05-09</title><link href="https://chunzhuo.github.io/blog/2026/ai-daily-sprouts/" rel="alternate" type="text/html" title="AI Daily Sprouts | 2026-05-09"/><published>2026-05-09T00:00:00+00:00</published><updated>2026-05-09T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2026/ai-daily-sprouts</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2026/ai-daily-sprouts/"><![CDATA[<p>Search date: 2026-05-09. Window used: roughly the last 7-14 days, with one slightly older paper included because it directly relates to agent skill learning.</p> <h2 id="top-items">Top items</h2> <h3 id="openai-released-new-realtime-voice-models-for-the-api">OpenAI released new realtime voice models for the API</h3> <ul> <li>Date: 2026-05-07</li> <li>Source: <a href="https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/">OpenAI</a></li> <li>Type: product release</li> </ul> <p>OpenAI introduced GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper for live voice reasoning, translation, and streaming transcription. Voice agents are moving from turn-taking demos toward tool-using, multilingual, realtime workflows. 
The 128K context window for GPT-Realtime-2 also makes longer voice sessions more practical.</p> <p>Caveat: the performance claims are vendor-reported; production behavior still depends heavily on latency, tool design, and domain-specific evaluation.</p> <h3 id="openai-made-gpt-55-instant-the-default-chatgpt-model">OpenAI made GPT-5.5 Instant the default ChatGPT model</h3> <ul> <li>Date: 2026-05-05</li> <li>Source: <a href="https://openai.com/index/gpt-5-5-instant/">OpenAI</a></li> <li>Supporting source: <a href="https://openai.com/index/gpt-5-5-instant-system-card/">OpenAI system card</a></li> <li>Type: model release and safety publication</li> </ul> <p>GPT-5.5 Instant became ChatGPT’s default model, with OpenAI reporting fewer hallucinated claims than GPT-5.3 Instant, especially on high-stakes prompts. The main direction is reliability rather than only raw capability: lower hallucination rates, better image/STEM handling, improved search decisions, and more transparent personalization controls.</p> <p>Caveat: the hallucination reductions are from OpenAI’s internal evaluations; independent replication would be useful.</p> <h3 id="google-deepmind-highlighted-alphaevolves-broader-impact">Google DeepMind highlighted AlphaEvolve’s broader impact</h3> <ul> <li>Date: 2026-05-07</li> <li>Source: <a href="https://deepmind.google/blog/alphaevolve-impact/">Google DeepMind</a></li> <li>Type: research and deployment update</li> </ul> <p>DeepMind reported AlphaEvolve applications across genomics, grid optimization, quantum circuits, mathematics, TPU design, storage systems, logistics, ads, and materials/life-science modeling. This is a strong signal that LLM-powered algorithm discovery is becoming operational infrastructure, not just a research demo.</p> <p>Caveat: many claims are application-specific and come from Google or partner deployments; the generality of the approach depends on whether problems have reliable automated evaluators.</p> <h3 id="us-caisi-expanded-frontier-ai-model-testing-agreements">U.S. CAISI expanded frontier AI model testing agreements</h3> <ul> <li>Date: 2026-05-05</li> <li>Source: <a href="https://www.nist.gov/news-events/news/2026/05/caisi-signs-agreements-regarding-frontier-ai-national-security-testing">NIST / CAISI</a></li> <li>Supporting source: <a href="https://blogs.microsoft.com/on-the-issues/2026/05/05/advancing-ai-evaluation-with-the-center-for-ai-standards-us-and-innovation-and-the-ai-security-institute-uk/">Microsoft</a></li> <li>Type: policy / safety governance</li> </ul> <p>CAISI announced agreements with Google DeepMind, Microsoft, and xAI for pre-deployment evaluations and targeted research on frontier AI capabilities and security risks. Frontier model assessment is becoming more formalized, especially for cybersecurity, biosecurity, chemical-risk, and national-security concerns.</p> <p>Caveat: these are collaborative testing agreements, not a full public regulatory regime; details of model access, evaluation criteria, and enforcement remain limited.</p> <h3 id="anthropic-expanded-compute-capacity-and-claude-usage-limits">Anthropic expanded compute capacity and Claude usage limits</h3> <ul> <li>Date: 2026-05-06</li> <li>Source: <a href="https://www.anthropic.com/news/higher-limits-spacex">Anthropic</a></li> <li>Type: infrastructure / product capacity</li> </ul> <p>Anthropic announced a SpaceX compute partnership and higher Claude Code/API usage limits, including doubled five-hour Claude Code limits for several paid plans. 
Capacity is still a strategic bottleneck for frontier AI products. More compute directly affects developer workflows, API availability, and model deployment scale.</p> <h3 id="anthropic-announced-an-enterprise-ai-services-company">Anthropic announced an enterprise AI services company</h3> <ul> <li>Date: 2026-05-04</li> <li>Source: <a href="https://www.anthropic.com/news/enterprise-ai-services-company">Anthropic</a></li> <li>Type: enterprise AI deployment</li> </ul> <p>Anthropic, Blackstone, Hellman &amp; Friedman, and Goldman Sachs announced a new AI services company focused on helping mid-sized companies deploy Claude in core operations. Frontier labs are moving deeper into implementation services, not only model/API distribution.</p> <h2 id="recent-papers-and-benchmarks">Recent papers and benchmarks</h2> <h3 id="claw-eval-live-a-live-agent-benchmark-for-evolving-real-world-workflows">Claw-Eval-Live: A Live Agent Benchmark for Evolving Real-World Workflows</h3> <ul> <li>Date: 2026-05-01</li> <li>Source: <a href="https://chatpaper.com/paper/274070">ChatPaper summary</a></li> <li>Type: agent benchmark paper</li> </ul> <p>Static agent benchmarks age quickly and often grade final answers without verifying whether the agent actually executed a workflow. Claw-Eval-Live separates a refreshable signal layer from reproducible, timestamped release snapshots so agent tasks can evolve with real workflow demand.</p> <p>Caveat: I found a secondary paper page during this quick run; for a deeper digest, verify against the arXiv page or project repository.</p> <h3 id="skilllearnbench-benchmarking-continual-learning-methods-for-agent-skill-generation-on-real-world-tasks">SkillLearnBench: Benchmarking Continual Learning Methods for Agent Skill Generation on Real-World Tasks</h3> <ul> <li>Date: 2026-04-22</li> <li>Source: <a href="https://www.emergentmind.com/papers/2604.20087">Emergent Mind paper page</a></li> <li>Type: agent learning benchmark paper</li> </ul> <p>Skills are increasingly used to make agents reliable on complex tasks, but automatically generating and improving those skills is still uneven. 
This benchmark evaluates continual skill learning across 20 verified tasks and measures skill quality, execution trajectory, and task outcome.</p> <h2 id="watch-list">Watch list</h2> <ul> <li>Voice agents are becoming more tool-oriented and production-shaped.</li> <li>Frontier-model evaluation is shifting toward government-lab collaboration before deployment.</li> <li>Agent benchmarks are increasingly emphasizing live workflows, verification, and changing environments.</li> <li>Algorithm-discovery agents such as AlphaEvolve are moving from research examples into infrastructure and commercial optimization.</li> </ul>]]></content><author><name></name></author><category term="daily-sprouts"/><category term="AI"/><category term="papers"/><category term="AI-news"/><category term="daily-sprouts"/><summary type="html"><![CDATA[Daily AI research and news digest covering model releases, AI agents, frontier-model evaluation, and AI infrastructure.]]></summary></entry><entry xml:lang="en"><title type="html">bioAI Daily Sprouts | 2026-05-09</title><link href="https://chunzhuo.github.io/blog/2026/bioai-daily-sprouts/" rel="alternate" type="text/html" title="bioAI Daily Sprouts | 2026-05-09"/><published>2026-05-09T00:00:00+00:00</published><updated>2026-05-09T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2026/bioai-daily-sprouts</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2026/bioai-daily-sprouts/"><![CDATA[<p>Search date: 2026-05-09. Window: 2026-04-09 to 2026-05-09. Sources prioritized: Nature Biotechnology and Nature Methods publisher pages, with peer-reviewed articles and major reviews favored over news items.</p> <h2 id="papers">Papers</h2> <ol> <li> <p><strong>Digital twins of ex vivo human lungs enable accurate and personalized evaluation of therapeutic efficacy</strong> Nature Biotechnology, 2026-05-04. <a href="https://doi.org/10.1038/s41587-026-03121-4">DOI/link</a> Summary: Builds data-rich human lung digital twins from ex vivo lung perfusion, integrating physiology, imaging, transcriptomics, metabolomics and proteomics to forecast organ behavior and therapeutic response. Why it matters: It shows how organ-scale digital twins can be anchored in prospective human-organ measurements rather than purely retrospective clinical modeling. Tags: digital twins; translational biology; precision medicine; computational biology</p> </li> <li> <p><strong>TxPert: using multiple knowledge graphs for prediction of transcriptomic perturbation effects</strong> Nature Biotechnology, 2026-05-01. <a href="https://doi.org/10.1038/s41587-026-03113-4">DOI/link</a> Summary: Introduces a deep learning framework that combines basal transcriptomic state encoding with multiple biological knowledge graphs to predict out-of-distribution genetic perturbation responses. Why it matters: Perturbation prediction is central to model-guided experiments and drug discovery, and this paper explicitly benchmarks against strong nonlearned baselines and experimental reproducibility. Tags: AI4Bio; perturbation prediction; transcriptomics; knowledge graphs; machine learning</p> </li> <li> <p><strong>DNA-guided CRISPR-Cas12a effectors for programmable RNA recognition and cleavage</strong> Nature Biotechnology, 2026-05-01. <a href="https://doi.org/10.1038/s41587-026-03120-5">DOI/link</a> Summary: Reprograms Cas12a into a DNA-guided, RNA-targeting effector and demonstrates direct RNA detection plus intracellular RNA knockdown. 
Why it matters: The work expands programmable nucleic-acid engineering beyond canonical RNA-guided CRISPR architectures and creates new design space for RNA diagnostics and manipulation. Tags: CRISPR; RNA; synthetic biology; diagnostics; biotechnology</p> </li> <li> <p><strong>Single-molecule localization and diffusivity microscopy reveals dynamic biomolecular organization in living cells</strong> Nature Methods, 2026-04-28. <a href="https://doi.org/10.1038/s41592-026-03078-x">DOI/link</a> Summary: Presents SMLDM, a deep learning-enabled microscopy method that estimates molecule movement and diffusion from single-frame snapshots without trajectory linking. Why it matters: It sharply increases mapping density for live-cell molecular dynamics, helping connect spatial organization with mobility in chromatin, receptors, adhesions and condensates. Tags: bioimage informatics; deep learning; microscopy; single-molecule biophysics</p> </li> <li> <p><strong>Systematically decoding pathological morphologies and molecular profiles with unified multimodal embedding</strong> Nature Methods, 2026-04-24. <a href="https://doi.org/10.1038/s41592-026-03070-5">DOI/link</a> Summary: Introduces Multi-Embed, an interpretable multimodal framework for linking pathology morphology with multilayer molecular profiles. Why it matters: Computational pathology is moving from image-only predictors toward morphology-to-molecular reasoning that can support mechanistic disease interpretation. Tags: computational pathology; multimodal learning; molecular profiling; machine learning</p> </li> <li> <p><strong>Direct RNA sequencing and signal alignment reveal RNA structure ensembles in a eukaryotic cell</strong> Nature Methods, 2026-04-24. <a href="https://doi.org/10.1038/s41592-026-03069-y">DOI/link</a> Summary: Combines chemical probing, direct RNA sequencing and signal alignment to map RNA structural ensembles at single-molecule resolution in eukaryotic cells. Why it matters: It turns raw direct-sequencing signal into a richer readout of RNA structural heterogeneity, connecting transcript sequence, isoforms and regulatory structure. Tags: RNA structure; direct RNA sequencing; transcriptomics; computational biology</p> </li> <li> <p><strong>High-fidelity intravital imaging of biological dynamics with latent-space-enhanced digital adaptive optics</strong> Nature Biotechnology, 2026-04-23. <a href="https://doi.org/10.1038/s41587-026-03107-2">DOI/link</a> Summary: Develops latent-space-enhanced digital adaptive optics for intravital fluorescence microscopy, using wave-optics priors in spatial-angular data to improve aberration estimation. Why it matters: Better computational correction can make in vivo immune, neural and injury imaging more quantitative without relying only on expensive custom hardware. Tags: bioimage informatics; microscopy; latent representations; computational imaging</p> </li> <li> <p><strong>Orthrus: toward evolutionary and functional RNA foundation models</strong> Nature Methods, 2026-04-17. <a href="https://doi.org/10.1038/s41592-026-03064-3">DOI/link</a> Summary: Builds an RNA foundation-model direction aimed at learning evolutionary and functional representations across RNA sequences. Why it matters: RNA language models are becoming a parallel track to protein language models, with potential utility in RNA biology, functional prediction and therapeutic design. 
Tags: AI4Bio; RNA; foundation models; sequence modeling; transcriptomics</p> </li> <li> <p><strong>Artificial allosteric protein switches with machine-learning-designed receptors</strong> Nature Biotechnology, 2026-04-15. <a href="https://doi.org/10.1038/s41587-026-03081-9">DOI/link</a> Summary: Shows that machine-learning-designed ligand-binding domains can act as receptors in artificial allosteric protein switches and biosensors. Why it matters: It links generative protein design to working synthetic-biology devices, including logic gates, engineered cells and bioelectronic hormone sensing. Tags: protein design; synthetic biology; biosensors; AI4Bio</p> </li> <li> <p><strong>Inducible, split base editors for in vivo cancer functional genomics</strong> Nature Biotechnology, 2026-04-15. <a href="https://doi.org/10.1038/s41587-026-03077-5">DOI/link</a> Summary: Designs split, inducible base editors for controlled in vivo cancer functional genomics, reducing constraints from constitutively active deaminase systems. Why it matters: More controllable base-editing screens can improve mutation-level functional genomics in animal models and better separate target effects from editor toxicity. Tags: genome editing; base editors; cancer genomics; functional genomics</p> </li> <li> <p><strong>Adaptive optical correction for in vivo two-photon fluorescence microscopy with neural fields</strong> Nature Methods, 2026-04-13. <a href="https://doi.org/10.1038/s41592-026-03053-6">DOI/link</a> Summary: Uses neural fields to perform adaptive optical correction for in vivo two-photon microscopy under motion and sample-induced aberration. Why it matters: Neural representations are becoming useful infrastructure for biological imaging, especially when hardware-only correction is difficult or fragile. 
Tags: bioimage informatics; neural fields; microscopy; neuroscience; software</p> </li> </ol> <h2 id="watch-list">Watch list</h2> <ul> <li>Perturbation modeling is maturing: papers now spend more space on realistic out-of-distribution tasks, baselines and reproducibility ceilings.</li> <li>RNA-focused foundation models and direct RNA signal analysis are both advancing, suggesting stronger computational tools for RNA function and RNA therapeutics.</li> <li>Bioimage informatics is shifting toward latent representations, neural fields and deep-learning-assisted physical correction rather than segmentation alone.</li> <li>Experimentally grounded AI4Bio remains the strongest signal: the most useful papers combine model advances with organ systems, live-cell imaging, CRISPR tools or protein engineering validation.</li> </ul>]]></content><author><name></name></author><category term="daily-sprouts"/><category term="bioAI"/><category term="AI4Bio"/><category term="bioinformatics"/><category term="papers"/><category term="daily-sprouts"/><summary type="html"><![CDATA[Daily AI4Bio, bioinformatics, and computational biology paper digest.]]></summary></entry><entry xml:lang="en"><title type="html">Multimodality for Biology</title><link href="https://chunzhuo.github.io/blog/2026/multimodality-for-biology/" rel="alternate" type="text/html" title="Multimodality for Biology"/><published>2026-05-07T00:00:00+00:00</published><updated>2026-05-07T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2026/multimodality-for-biology</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2026/multimodality-for-biology/"><![CDATA[<p>In single-cell and broader computational biology, “multimodality” comes in many flavors — DNA sequence, RNA expression, chromatin accessibility, protein levels, perturbation responses, knowledge graphs, text. The hard part is rarely listing the modalities; it is choosing how to fuse them.</p> <p>These are notes from a recent talk where I tried to organize the landscape into three approaches: <strong>bottom-up</strong>, <strong>parallel</strong>, and <strong>uniform</strong>. Each makes a different bet about where biological structure lives and where modalities should meet inside the model.</p> <h2 id="multimodality-tasks">Multimodality tasks</h2> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image1-480.webp 480w,/assets/img/posts/multimodality-for-biology/image1-800.webp 800w,/assets/img/posts/multimodality-for-biology/image1-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image1.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>Before fixing on an architecture it helps to be explicit about the tasks we want a multimodal biological model to do — cross-modal prediction, perturbation response, cell-state inference, sequence-to-function, and so on. 
Different tasks pull architecture in different directions, and the rest of this post only makes sense relative to what we are asking the model to predict.</p> <h2 id="bottom-up-approach">Bottom-up approach</h2> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image2-480.webp 480w,/assets/img/posts/multimodality-for-biology/image2-800.webp 800w,/assets/img/posts/multimodality-for-biology/image2-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image2.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image3-480.webp 480w,/assets/img/posts/multimodality-for-biology/image3-800.webp 800w,/assets/img/posts/multimodality-for-biology/image3-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image3.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>The bottom-up approach builds representations along the natural hierarchy of biology: <strong>molecular → cellular → multicellular</strong>. UCE-style models learn cell embeddings from gene-level tokens; models like PULSAR push further toward tissue- and multicellular-level structure. Each tier is trained on what is plentiful at that scale, and the next tier inherits its substrate from below.</p> <p>The advantage is that each level is interpretable on its own terms and can be pretrained independently. The cost is that errors and biases compound as you climb the hierarchy.</p> <h3 id="from-sequence-to-perturbation">From sequence to perturbation</h3> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image4-480.webp 480w,/assets/img/posts/multimodality-for-biology/image4-800.webp 800w,/assets/img/posts/multimodality-for-biology/image4-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image4.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>A concrete instance of the bottom-up program: start from genomic sequence and train representations that transfer downstream to perturbation prediction. 
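The chain is <em>sequence → expression → response</em>, and the architectural question is at which level multimodal signals should enter.</p> <p>A minimal sketch of that chain, with hypothetical module names and shapes that stand in for real sequence models rather than reproduce any published architecture:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch
import torch.nn as nn

class SeqEncoder(nn.Module):
    """Molecular tier: one-hot DNA windows to a fixed-size embedding."""
    def __init__(self, d=64):
        super().__init__()
        self.conv = nn.Conv1d(4, d, kernel_size=9, padding=4)
    def forward(self, x):  # x: (batch, 4, length)
        return self.conv(x).mean(dim=-1)  # (batch, d)

class ExpressionHead(nn.Module):
    """Cellular tier: sequence embedding to predicted expression."""
    def __init__(self, d=64, n_genes=100):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(d, d), nn.ReLU(), nn.Linear(d, n_genes))
    def forward(self, z):
        return self.mlp(z)

class ResponseHead(nn.Module):
    """Downstream tier: expression state to perturbation response."""
    def __init__(self, n_genes=100):
        super().__init__()
        self.out = nn.Linear(n_genes, n_genes)
    def forward(self, e):
        return self.out(e)

seq = torch.randn(8, 4, 200)  # stand-in for a batch of one-hot sequences
response = ResponseHead()(ExpressionHead()(SeqEncoder()(seq)))
print(response.shape)  # torch.Size([8, 100])
</code></pre></div></div> <p>The point of the sketch is only the factorization: each tier can be pretrained on what is plentiful at its own scale, which is also where the compounding of upstream errors enters.</p>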
<h2 id="parallel-approach">Parallel approach</h2> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image5-480.webp 480w,/assets/img/posts/multimodality-for-biology/image5-800.webp 800w,/assets/img/posts/multimodality-for-biology/image5-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image5.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>The parallel approach treats modalities as roughly co-equal and combines per-modality embeddings at the input. A canonical case: take a DNA sequence and seven epigenetic tracks, embed each independently, and <strong>directly sum the eight embeddings</strong>. Everything downstream sees a single fused vector.</p> <p>This is cheap, easy to scale modality-by-modality, and trivial to extend with a new track. The price is that direct summation assumes all modalities live in the same metric space — which is rarely true biologically.</p> <h3 id="separate-encoder-per-modality">Separate encoder per modality</h3> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image6-480.webp 480w,/assets/img/posts/multimodality-for-biology/image6-800.webp 800w,/assets/img/posts/multimodality-for-biology/image6-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image6.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>A more careful variant: keep one encoder per modality and fuse later. Each encoder can use whatever tokenization and inductive bias suits its data type, and fusion happens through concatenation, cross-attention, or gating — no longer at the input.</p> <h3 id="different-knowledge-sources">Different knowledge sources</h3> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image7-480.webp 480w,/assets/img/posts/multimodality-for-biology/image7-800.webp 800w,/assets/img/posts/multimodality-for-biology/image7-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image7.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>Beyond raw signals, multimodal can mean fusing different <em>kinds</em> of knowledge: an LLM for textual context, a knowledge graph for curated relations, tabular features for engineered priors.
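Two pooling strategies show up repeatedly:</p> <ul> <li><strong>Global pooling</strong> — a weighted average of source embeddings.</li> <li><strong>Attention-based pooling</strong> — let the query decide which source matters.</li> </ul> <p>The latter usually wins when the relevance of each source varies across examples.</p> <p>A minimal sketch of the two strategies, with invented shapes and random weights standing in for learned ones:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import torch

sources = torch.randn(3, 16)  # one 16-dim embedding per source (text, graph, tabular)
query = torch.randn(16)       # representation of the entity asking the question

# Global pooling: a query-independent weighted average of the sources.
w = torch.softmax(torch.randn(3), dim=0)  # stand-in for learned, fixed weights
global_pooled = (w.unsqueeze(1) * sources).sum(dim=0)

# Attention-based pooling: the query decides which source matters this time.
attn = torch.softmax(sources @ query / 16 ** 0.5, dim=0)
attn_pooled = (attn.unsqueeze(1) * sources).sum(dim=0)

print(global_pooled.shape, attn_pooled.shape)  # torch.Size([16]) torch.Size([16])
</code></pre></div></div>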
<h2 id="uniform-approach">Uniform approach</h2> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image8-480.webp 480w,/assets/img/posts/multimodality-for-biology/image8-800.webp 800w,/assets/img/posts/multimodality-for-biology/image8-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image8.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>The uniform approach goes the other direction from per-modality encoders: serialize multiple sequences into a single stream and let one model digest them all. Sequence-related tasks (DNA, RNA, protein) are a natural fit — they already share a token-stream shape.</p> <p>The simplicity is appealing — one model, one loss, no fusion module. The hard part is teaching a single model to respect the very different statistics of, say, codon usage versus regulatory motifs.</p> <h2 id="relational-transformer-for-biology">Relational transformer for biology</h2> <div class="row mt-3"> <div class="col-sm mt-3 mt-md-0"> <figure> <picture> <source class="responsive-img-srcset" srcset="/assets/img/posts/multimodality-for-biology/image9-480.webp 480w,/assets/img/posts/multimodality-for-biology/image9-800.webp 800w,/assets/img/posts/multimodality-for-biology/image9-1400.webp 1400w," type="image/webp" sizes="95vw"/> <img src="/assets/img/posts/multimodality-for-biology/image9.png" class="img-fluid rounded z-depth-1" width="100%" height="auto" data-zoomable="" loading="eager" onerror="this.onerror=null; $('.responsive-img-srcset').remove();"/> </picture> </figure> </div> </div> <p>The architecture I am most interested in is a <strong>relational transformer</strong>: instead of forcing modalities through a single fusion bottleneck, represent biological entities (genes, cells, regions) as nodes and let attention range over typed relations between them.</p> <h3 id="details">Details</h3> <p>Two attention patterns carry most of the weight:</p> <ul> <li><strong>Relational attention</strong> — for <em>complementary</em> modalities, where each modality contributes information the others do not. Attention selects across modalities at each layer.</li> <li><strong>Hierarchical attention</strong> — for <em>hierarchical</em> modalities, where the structure itself is nested (region → gene → cell → tissue). Attention is constrained by that hierarchy.</li> </ul> <p>Two open problems I keep running into:</p> <ul> <li><strong>Memory constraint.</strong> Cross-modality attention is quadratic in token count, and biological inputs are long.</li> <li><strong>Coupling-data constraint.</strong> Training relational attention requires examples where modalities are observed together, and truly paired multimodal datasets at scale are still rare.</li> </ul> <p>These are the bottlenecks I think the next round of work — mine and others’ — needs to address.</p> <h2 id="references">References</h2> <ol> <li><span id="liang2024foundations">Liang, P.
P., Zadeh, A., &amp; Morency, L.-P. (2024). Foundations &amp; Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions. <i>ACM Computing Surveys</i>, <i>56</i>(10). https://doi.org/10.1145/3656580</span></li> <li><span id="rosen2023uce">Rosen, Y., Roohani, Y., Agrawal, A., Samotorcan, L., Quake, S. R., &amp; Leskovec, J. (2023). Universal Cell Embeddings: A Foundation Model for Cell Biology. <i>BioRxiv</i>. https://doi.org/10.1101/2023.11.28.568918</span></li> <li><span id="pang2025pulsar">Pang, K., Rosen, Y., Kedzierska, K., He, Z., Rajagopal, A., Gustafson, C. E., Huynh, G., &amp; Leskovec, J. (2025). PULSAR: a Foundation Model for Multi-scale and Multicellular Biology. <i>BioRxiv</i>. https://doi.org/10.1101/2025.11.24.685470</span></li> <li><span id="fu2026strand">Fu, B., Dasoulas, G., Gabbita, S., Lin, X., Gao, S., Su, X., Ghosh, S., &amp; Zitnik, M. (2026). STRAND: Sequence-Conditioned Transport for Single-Cell Perturbations. <i>ArXiv Preprint ArXiv:2602.10156</i>. https://arxiv.org/abs/2602.10156</span></li> <li><span id="yang2024multimodal">Yang, Z., Fan, X., Lan, M., Tang, X., Zheng, Z., Liu, B., You, Y., Tian, L., Church, G., Liu, X., &amp; Gu, F. (2024). Multimodal foundation model predicts zero-shot functional perturbations and cell fate dynamics. <i>BioRxiv</i>. https://doi.org/10.1101/2024.12.19.629561</span></li> <li><span id="yang2023genecompass">Yang, X., Liu, G., Feng, G., Bu, D., Wang, P., &amp; others. (2023). GeneCompass: Deciphering Universal Gene Regulatory Mechanisms with Knowledge-Informed Cross-Species Foundation Model. <i>BioRxiv</i>. https://doi.org/10.1101/2023.09.26.559542</span></li> <li><span id="littman2025presage">Littman, R., Levine, J., Maleki, S., Lee, Y., Ermakov, V., Qiu, L., Wu, A., Huang, K., Lopez, R., Scalia, G., Biancalani, T., Richmond, D., Regev, A., &amp; Hütter, J.-C. (2025). Gene-embedding-based prediction and functional evaluation of perturbation expression responses with PRESAGE. <i>BioRxiv</i>. https://doi.org/10.1101/2025.06.03.657653</span></li> <li><span id="golkar2026mimic">Golkar, S., Kovalic, J., Espejo Morales, I., Sledzieski, S., Cho, K., Cranmer, M., Ho, S., &amp; others. (2026). MIMIC: A Generative Multimodal Foundation Model for Biomolecules. <i>ArXiv Preprint ArXiv:2604.24506</i>. https://arxiv.org/abs/2604.24506</span></li> <li><span id="ranjan2025relational">Ranjan, R., Hudovernik, V., Znidar, M., Kanatsoulis, C., Upendra, R., Mohammadi, M., Meyer, J., Palczewski, T., Guestrin, C., &amp; Leskovec, J. (2025). Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data. <i>ArXiv Preprint ArXiv:2510.06377</i>. 
https://arxiv.org/abs/2510.06377</span></li> </ol>]]></content><author><name></name></author><category term="research-notes"/><category term="machine learning"/><category term="biology"/><category term="multimodality"/><category term="single-cell"/><category term="foundation-models"/><summary type="html"><![CDATA[Three approaches — bottom-up, parallel, and uniform — for fusing biological modalities, and where I think the field should go.]]></summary></entry><entry xml:lang="en"><title type="html">a post with plotly.js</title><link href="https://chunzhuo.github.io/blog/2025/plotly/" rel="alternate" type="text/html" title="a post with plotly.js"/><published>2025-03-26T14:24:00+00:00</published><updated>2025-03-26T14:24:00+00:00</updated><id>https://chunzhuo.github.io/blog/2025/plotly</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2025/plotly/"><![CDATA[<p>This is an example post with some <a href="https://plotly.com/javascript/">plotly</a> code.</p> <div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">```</span><span class="nl">plotly
</span><span class="sb">{
  "data": [
    {
      "x": [1, 2, 3, 4],
      "y": [10, 15, 13, 17],
      "type": "scatter"
    },
    {
      "x": [1, 2, 3, 4],
      "y": [16, 5, 11, 9],
      "type": "scatter"
    }
  ]
}</span>
<span class="p">```</span>
</code></pre></div></div> <p>Which generates:</p> <pre><code class="language-plotly">{
  "data": [
    {
      "x": [1, 2, 3, 4],
      "y": [10, 15, 13, 17],
      "type": "scatter"
    },
    {
      "x": [1, 2, 3, 4],
      "y": [16, 5, 11, 9],
      "type": "scatter"
    }
  ]
}
</code></pre> <p>Here is another example chart.</p> <div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">```</span><span class="nl">plotly
</span><span class="sb">{
  "data": [
    {
      "x": [1, 2, 3, 4],
      "y": [10, 15, 13, 17],
      "mode": "markers"
    },
    {
      "x": [2, 3, 4, 5],
      "y": [16, 5, 11, 9],
      "mode": "lines"
    },
    {
      "x": [1, 2, 3, 4],
      "y": [12, 9, 15, 12],
      "mode": "lines+markers"
    }
  ],
  "layout": {
    "title": {
      "text": "Line and Scatter Plot"
    }
  }
}</span>
<span class="p">```</span>
</code></pre></div></div> <p>This is how it looks:</p> <pre><code class="language-plotly">{
  "data": [
    {
      "x": [1, 2, 3, 4],
      "y": [10, 15, 13, 17],
      "mode": "markers"
    },
    {
      "x": [2, 3, 4, 5],
      "y": [16, 5, 11, 9],
      "mode": "lines"
    },
    {
      "x": [1, 2, 3, 4],
      "y": [12, 9, 15, 12],
      "mode": "lines+markers"
    }
  ],
  "layout": {
    "title": {
      "text": "Line and Scatter Plot"
    }
  }
}
</code></pre>]]></content><author><name></name></author><category term="sample-posts"/><category term="formatting"/><category term="charts"/><summary type="html"><![CDATA[this is what included plotly.js code could look like]]></summary></entry><entry xml:lang="en"><title type="html">a post with image galleries</title><link href="https://chunzhuo.github.io/blog/2024/photo-gallery/" rel="alternate" type="text/html" title="a post with image galleries"/><published>2024-12-04T01:59:00+00:00</published><updated>2024-12-04T01:59:00+00:00</updated><id>https://chunzhuo.github.io/blog/2024/photo-gallery</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2024/photo-gallery/"><![CDATA[<p>The images in this post are all zoomable, arranged into different mini-galleries using different libraries.</p> <h2 id="lightbox2"><a href="https://lokeshdhakar.com/projects/lightbox2/">Lightbox2</a></h2> <p><a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-2500.jpg" data-lightbox="roadtrip"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-200.jpg"/></a> <a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-2500.jpg" data-lightbox="roadtrip"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-200.jpg"/></a> <a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-2500.jpg" data-lightbox="roadtrip"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-200.jpg"/></a></p> <hr/> <h2 id="photoswipe"><a href="https://photoswipe.com/">PhotoSwipe</a></h2> <div class="pswp-gallery pswp-gallery--single-column" id="gallery--getting-started"> <a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-2500.jpg" data-pswp-width="1669" data-pswp-height="2500" target="_blank"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-200.jpg" alt=""/> </a> <a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/7/img-2500.jpg" data-pswp-width="1875" data-pswp-height="2500" data-cropped="true" target="_blank"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/7/img-200.jpg" alt=""/> </a> <a href="https://unsplash.com" data-pswp-src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-2500.jpg" data-pswp-width="2500" data-pswp-height="1666" target="_blank"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-200.jpg" alt=""/> </a> <div> <a href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/6/img-2500.jpg" data-pswp-width="2500" data-pswp-height="1667" target="_blank"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/6/img-200.jpg" alt=""/> </a> </div> </div> <hr/> <h2 id="spotlight-js"><a href="https://nextapps-de.github.io/spotlight/">Spotlight JS</a></h2> <div class="spotlight-group"> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-2500.jpg"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-200.jpg"/> </a> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-2500.jpg"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-200.jpg"/> </a> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-2500.jpg"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-200.jpg"/> </a> </div> <div class="spotlight-group"> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/4/img-2500.jpg"> <img 
src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/4/img-200.jpg"/> </a> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/5/img-2500.jpg"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/5/img-200.jpg"/> </a> <a class="spotlight" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/6/img-2500.jpg"> <img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/6/img-200.jpg"/> </a> </div> <hr/> <h2 id="venobox"><a href="https://veno.es/venobox/">Venobox</a></h2> <p><a class="venobox" data-gall="myGallery" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-2500.jpg"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/1/img-200.jpg"/></a> <a class="venobox" data-gall="myGallery" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-2500.jpg"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/2/img-200.jpg"/></a> <a class="venobox" data-gall="myGallery" href="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-2500.jpg"><img src="https://cdn.photoswipe.com/photoswipe-demo-images/photos/3/img-200.jpg"/></a></p>]]></content><author><name></name></author><category term="sample-posts"/><category term="formatting"/><category term="images"/><summary type="html"><![CDATA[this is what included image galleries could look like]]></summary></entry><entry><title type="html">Google Gemini updates: Flash 1.5, Gemma 2 and Project Astra</title><link href="https://chunzhuo.github.io/blog/2024/google-gemini-updates-flash-15-gemma-2-and-project-astra/" rel="alternate" type="text/html" title="Google Gemini updates: Flash 1.5, Gemma 2 and Project Astra"/><published>2024-05-14T00:00:00+00:00</published><updated>2024-05-14T00:00:00+00:00</updated><id>https://chunzhuo.github.io/blog/2024/google-gemini-updates-flash-15-gemma-2-and-project-astra</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2024/google-gemini-updates-flash-15-gemma-2-and-project-astra/"><![CDATA[<p>Learn more:Learn more:Learn more:Learn more:Learn more:Learn more:May 14, 2024 We’re introducing a series of updates across the Gemini family of models, including the new 1.5 Flash, our lightweight model for speed and efficiency, and Project Astra, our vision for the future of AI assistants. In December, we launched our first natively multimodal model Gemini 1.0 in three sizes: Ultra, Pro and Nano. Just a few months later we released 1.5 Pro, with enhanced performance and a breakthrough long context window of 1 million tokens.Developers and enterprise customers have been putting 1.5 Pro to use in incredible ways and finding its long context window, multimodal reasoning capabilities and impressive overall performance incredibly useful.We know from user feedback that some applications need lower latency and a lower cost to serve. This inspired us to keep innovating, so today, we’re introducing Gemini 1.5 Flash: a model that’s lighter-weight than 1.5 Pro, and designed to be fast and efficient to serve at scale.Both 1.5 Pro and 1.5 Flash are available in public preview with a 1 million token context window in Google AI Studio and Vertex AI. 
And now, 1.5 Pro is also available with a 2 million token context window via waitlist to developers using the API and to Google Cloud customers.</p> <p>We’re also introducing updates across the Gemini family of models, announcing our next generation of open models, Gemma 2, and sharing progress on the future of AI assistants, with Project Astra.</p> <p><em>Figure: Context lengths of leading foundation models compared with Gemini 1.5’s 2 million token capability.</em></p> <p>1.5 Flash is the newest addition to the Gemini model family and the fastest Gemini model served in the API. It’s optimized for high-volume, high-frequency tasks at scale, is more cost-efficient to serve and features our breakthrough long context window.</p> <p>While it’s a lighter weight model than 1.5 Pro, it’s highly capable of multimodal reasoning across vast amounts of information and delivers impressive quality for its size.</p> <p>The new Gemini 1.5 Flash model is optimized for speed and efficiency, is highly capable of multimodal reasoning and features our breakthrough long context window.</p> <p>1.5 Flash excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more. This is because it’s been trained by 1.5 Pro through a process called “distillation,” where the most essential knowledge and skills from a larger model are transferred to a smaller, more efficient model.</p> <p>Read more about 1.5 Flash in our updated Gemini 1.5 technical report, on the Gemini technology page, and learn about 1.5 Flash’s availability and pricing.</p> <p>Over the last few months, we’ve significantly improved 1.5 Pro, our best model for general performance across a wide range of tasks.</p> <p>Beyond extending its context window to 2 million tokens, we’ve enhanced its code generation, logical reasoning and planning, multi-turn conversation, and audio and image understanding through data and algorithmic advances. We see strong improvements on public and internal benchmarks for each of these tasks.</p> <p>1.5 Pro can now follow increasingly complex and nuanced instructions, including ones that specify product-level behavior involving role, format and style. We’ve improved control over the model’s responses for specific use cases, like crafting the persona and response style of a chat agent or automating workflows through multiple function calls. And we’ve enabled users to steer model behavior by setting system instructions.</p> <p>We added audio understanding in the Gemini API and Google AI Studio, so 1.5 Pro can now reason across image and audio for videos uploaded in Google AI Studio. And we’re now integrating 1.5 Pro into Google products, including Gemini Advanced and in Workspace apps.</p> <p>Read more about 1.5 Pro in our updated Gemini 1.5 technical report and on the Gemini technology page.</p> <p>Gemini Nano is expanding beyond text-only inputs to include images as well. Starting with Pixel, applications using Gemini Nano with Multimodality will be able to understand the world the way people do — not just through text, but also through sight, sound and spoken language.</p> <p>Read more about Gemini 1.0 Nano on Android.</p> <p>Today, we’re also sharing a series of updates to Gemma, our family of open models built from the same research and technology used to create the Gemini models.</p> <p>We’re announcing Gemma 2, our next generation of open models for responsible AI innovation. Gemma 2 has a new architecture designed for breakthrough performance and efficiency, and will be available in new sizes.</p> <p>The Gemma family is also expanding with PaliGemma, our first vision-language model inspired by PaLI-3.
And we’ve upgraded our Responsible Generative AI Toolkit with LLM Comparator for evaluating the quality of model responses.</p> <p>Read more on the Developer blog.</p> <p>As part of Google DeepMind’s mission to build AI responsibly to benefit humanity, we’ve always wanted to develop universal AI agents that can be helpful in everyday life. That’s why today, we’re sharing our progress in building the future of AI assistants with Project Astra (advanced seeing and talking responsive agent).</p> <p>To be truly useful, an agent needs to understand and respond to the complex and dynamic world just like people do — and take in and remember what it sees and hears to understand context and take action. It also needs to be proactive, teachable and personal, so users can talk to it naturally and without lag or delay.</p> <p>While we’ve made incredible progress developing AI systems that can understand multimodal information, getting response time down to something conversational is a difficult engineering challenge. Over the past few years, we’ve been working to improve how our models perceive, reason and converse to make the pace and quality of interaction feel more natural.</p> <p>Building on Gemini, we’ve developed prototype agents that can process information faster by continuously encoding video frames, combining the video and speech input into a timeline of events, and caching this information for efficient recall.</p> <p>By leveraging our leading speech models, we also enhanced how they sound, giving the agents a wider range of intonations. These agents can better understand the context they’re being used in, and respond quickly, in conversation.</p> <p>With technology like this, it’s easy to envision a future where people could have an expert AI assistant by their side, through a phone or glasses. And some of these capabilities are coming to Google products, like the Gemini app and web experience, later this year.</p> <p>We’ve made incredible progress so far with our family of Gemini models, and we’re always striving to advance the state-of-the-art even further. By investing in a relentless production line of innovation, we’re able to explore new ideas at the frontier, while also unlocking the possibility of new and exciting Gemini use cases.</p> <p>Learn more about Gemini and its capabilities.</p>
]]></content><author><name></name></author><category term="external-posts"/><category term="google"/><summary type="html"><![CDATA[We’re sharing updates across our Gemini family of models and a glimpse of Project Astra, our vision for the future of AI assistants.]]></summary></entry><entry xml:lang="en"><title type="html">a post with tabs</title><link href="https://chunzhuo.github.io/blog/2024/tabs/" rel="alternate" type="text/html" title="a post with tabs"/><published>2024-05-01T00:32:13+00:00</published><updated>2024-05-01T00:32:13+00:00</updated><id>https://chunzhuo.github.io/blog/2024/tabs</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2024/tabs/"><![CDATA[<p>This is what a post with <a href="https://github.com/Ovski4/jekyll-tabs">tabs</a> looks like. Note that tabs can be used for different purposes, not only for code.</p> <h2 id="first-tabs">First tabs</h2> <p>To add tabs, use the following syntax:</p> <div class="language-liquid highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">{%</span><span class="w"> </span><span class="nt">tabs</span><span class="w"> </span><span class="nv">group-name</span><span class="w"> </span><span class="cp">%}</span>

<span class="cp">{%</span><span class="w"> </span><span class="nt">tab</span><span class="w"> </span><span class="nv">group-name</span><span class="w"> </span><span class="nv">tab-name-1</span><span class="w"> </span><span class="cp">%}</span>

Content 1

<span class="cp">{%</span><span class="w"> </span><span class="nt">endtab</span><span class="w"> </span><span class="cp">%}</span>

<span class="cp">{%</span><span class="w"> </span><span class="nt">tab</span><span class="w"> </span><span class="nv">group-name</span><span class="w"> </span><span class="nv">tab-name-2</span><span class="w"> </span><span class="cp">%}</span>

Content 2

<span class="cp">{%</span><span class="w"> </span><span class="nt">endtab</span><span class="w"> </span><span class="cp">%}</span>

<span class="cp">{%</span><span class="w"> </span><span class="nt">endtabs</span><span class="w"> </span><span class="cp">%}</span>
</code></pre></div></div> <p>With this you can generate tabbed content like:</p> <ul id="log" class="tab" data-tab="6c3bca0a-f9f8-461c-a67c-19a51ba7f39c" data-name="log"> <li class="active" id="log-php"> <a href="#">php </a> </li> <li id="log-js"> <a href="#">js </a> </li> <li id="log-ruby"> <a href="#">ruby </a> </li> </ul> <ul class="tab-content" id="6c3bca0a-f9f8-461c-a67c-19a51ba7f39c" data-name="log"> <li class="active"> <div class="language-php highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">var_dump</span><span class="p">(</span><span class="s1">'hello'</span><span class="p">);</span>
</code></pre></div></div> </li> <li> <div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">console</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="dl">"</span><span class="s2">hello</span><span class="dl">"</span><span class="p">);</span>
</code></pre></div></div> </li> <li> <div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nx">pputs</span> <span class="dl">'</span><span class="s1">hello</span><span class="dl">'</span>
</code></pre></div></div> </li> </ul> <h2 id="another-example">Another example</h2> <ul id="data-struct" class="tab" data-tab="c350d409-567c-41ea-8b3d-8f3776217c82" data-name="data-struct"> <li class="active" id="data-struct-yaml"> <a href="#">yaml </a> </li> <li id="data-struct-json"> <a href="#">json </a> </li> </ul> <ul class="tab-content" id="c350d409-567c-41ea-8b3d-8f3776217c82" data-name="data-struct"> <li class="active"> <div class="language-yaml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="na">hello</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s2">"</span><span class="s">whatsup"</span>
  <span class="pi">-</span> <span class="s2">"</span><span class="s">hi"</span>
</code></pre></div></div> </li> <li> <div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"hello"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"whatsup"</span><span class="p">,</span><span class="w"> </span><span class="s2">"hi"</span><span class="p">]</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div> </li> </ul> <h2 id="tabs-for-something-else">Tabs for something else</h2> <ul id="something-else" class="tab" data-tab="82044b73-cd88-413a-a8eb-a23e63dd35b8" data-name="something-else"> <li class="active" id="something-else-text"> <a href="#">text </a> </li> <li id="something-else-quote"> <a href="#">quote </a> </li> <li id="something-else-list"> <a href="#">list </a> </li> </ul> <ul class="tab-content" id="82044b73-cd88-413a-a8eb-a23e63dd35b8" data-name="something-else"> <li class="active"> <p>Regular text</p> </li> <li> <blockquote> <p>A quote</p> </blockquote> </li> <li> <p>Hipster list</p> <ul> <li>brunch</li> <li>fixie</li> <li>raybans</li> <li>messenger bag</li> </ul> </li> </ul>]]></content><author><name></name></author><category term="sample-posts"/><category term="formatting"/><category term="code"/><summary type="html"><![CDATA[this is what included tabs in a post could look like]]></summary></entry><entry xml:lang="en"><title type="html">a post with typograms</title><link href="https://chunzhuo.github.io/blog/2024/typograms/" rel="alternate" type="text/html" title="a post with typograms"/><published>2024-04-29T23:36:10+00:00</published><updated>2024-04-29T23:36:10+00:00</updated><id>https://chunzhuo.github.io/blog/2024/typograms</id><content type="html" xml:base="https://chunzhuo.github.io/blog/2024/typograms/"><![CDATA[<p>This is an example post with some <a href="https://github.com/google/typograms/">typograms</a> code.</p> <div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">```</span><span class="nl">typograms
</span><span class="sb">+----+
|    |---&gt; My first diagram!
+----+</span>
<span class="p">```</span>
</code></pre></div></div> <p>Which generates:</p> <pre><code class="language-typograms">+----+
|    |---&gt; My first diagram!
+----+
</code></pre> <p>Another example:</p> <div class="language-markdown highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">```</span><span class="nl">typograms
</span><span class="sb">.------------------------.
|.----------------------.|
||"https://example.com" ||
|'----------------------'|
| ______________________ |
||                      ||
||   Welcome!           ||
||                      ||
||                      ||
||  .----------------.  ||
||  | username       |  ||
||  '----------------'  ||
||  .----------------.  ||
||  |"*******"       |  ||
||  '----------------'  ||
||                      ||
||  .----------------.  ||
||  |   "Sign-up"    |  ||
||  '----------------'  ||
||                      ||
|+----------------------+|
.------------------------.</span>
<span class="p">```</span>
</code></pre></div></div> <p>which generates:</p> <pre><code class="language-typograms">.------------------------.
|.----------------------.|
||"https://example.com" ||
|'----------------------'|
| ______________________ |
||                      ||
||   Welcome!           ||
||                      ||
||                      ||
||  .----------------.  ||
||  | username       |  ||
||  '----------------'  ||
||  .----------------.  ||
||  |"*******"       |  ||
||  '----------------'  ||
||                      ||
||  .----------------.  ||
||  |   "Sign-up"    |  ||
||  '----------------'  ||
||                      ||
|+----------------------+|
.------------------------.
</code></pre> <p>For more examples, check out the <a href="https://google.github.io/typograms/#examples">typograms documentation</a>.</p>]]></content><author><name></name></author><category term="sample-posts"/><category term="formatting"/><category term="diagrams"/><summary type="html"><![CDATA[this is what included typograms code could look like]]></summary></entry></feed>