Deepnote File Formats

This document describes the file formats used by Deepnote for storing notebooks, projects, and execution snapshots.

Overview

Deepnote uses two primary file types:

File	Extension	Purpose
Deepnote Project	`.deepnote`	Source file containing code, markdown, and configuration
Deepnote Snapshot	`.snapshot.deepnote`	A `.deepnote` file with execution outputs populated

Both use the same YAML-based format. A snapshot is simply a .deepnote file that includes block outputs and execution metadata, stored separately to keep your source files output-free.

The `.deepnote` Format

A .deepnote file is a portable, self-contained project file that can include multiple notebooks, environment configuration, and integrations.

For the complete schema definition, see deepnote-file-schema.ts.

Key Features

Portable: Everything needed to understand and run the project is in one file
Git-friendly: YAML format produces meaningful diffs
Multi-notebook: A single project can contain multiple notebooks
Environment-aware: Can include Python version, packages, and other dependencies
Integration-ready: Stores database connections and other integrations

Structure

version: 1.0.0
metadata:
  createdAt: "2025-01-08T10:00:00.000Z"
  modifiedAt: "2025-01-08T12:30:00.000Z"
project:
  id: 2e814690-4f02-465c-8848-5567ab9253b7
  name: My Analysis Project
  notebooks:
    - id: e132b172-b114-410e-8331-011517db664f
      name: Data Exploration
      executionMode: block
      blocks:
        - id: b75d3ada977549b29f4c7f2183d52fcf
          blockGroup: 9dd9578e604a4235a552d1f4a53336ee
          type: code
          content: |
            import pandas as pd
            print("Hello World!")
          sortingKey: a0
          metadata: {}
          outputs: []
  settings: {}
integrations:
  - id: 084f5334-5dbe-41c7-9020-3f66b9418062
    name: Production Database
    type: pgsql

Top-Level Fields

Field	Required	Description
`version`	Yes	Schema version (currently `1.0.0`)
`metadata`	Yes	File timestamps and other metadata
`project`	Yes	The project definition with notebooks
`integrations`	No	Database connections and external services
`environment`	No	Python version and package dependencies

Notebooks

Each notebook in project.notebooks contains:

Field	Description
`id`	Unique UUID v4 identifier
`name`	Human-readable notebook name
`executionMode`	How blocks execute (`block` or `all`)
`blocks`	Array of content blocks
`workingDirectory`	Base directory for file operations

Blocks

Blocks are the fundamental units of content. Each block has:

Field	Description
`id`	Unique identifier
`blockGroup`	Groups related blocks together
`type`	Block type (see table below)
`content`	The source code, markdown text, or query
`contentHash`	SHA-256 hash of the `content` field (optional)
`sortingKey`	Determines display order
`metadata`	Execution timing and other metadata
`outputs`	Execution outputs (kept in the source file only when snapshots are disabled)
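
As a rough illustration of how `sortingKey` could drive display order, the sketch below sorts blocks by plain string comparison of their keys. That comparison rule is an assumption (the document only says the key "determines display order"), and `blocks_in_display_order` is a hypothetical helper, not part of any Deepnote API.

```python
from typing import Dict, List


def blocks_in_display_order(blocks: List[Dict]) -> List[Dict]:
    """Return a new list of blocks sorted by their sortingKey.

    Assumption: keys compare as ordinary strings (lexicographic order).
    """
    return sorted(blocks, key=lambda block: block["sortingKey"])


blocks = [
    {"id": "third", "type": "code", "sortingKey": "a2"},
    {"id": "first", "type": "markdown", "sortingKey": "a0"},
    {"id": "second", "type": "sql", "sortingKey": "a1"},
]

for block in blocks_in_display_order(blocks):
    print(block["sortingKey"], block["id"])
```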

Block Types

Common block types include:

Type	Description
`code`	Python code cells
`markdown`	Rich text documentation using Markdown
`sql`	SQL queries against database integrations
`visualization`	Charts and graphs (Vega-Lite based)
`dataframe`	Interactive DataFrame explorer
`image`	Embedded images
`input`	Interactive input widgets (sliders, dropdowns, etc.)
`text-cell-h1`	Heading level 1 text block
`text-cell-h2`	Heading level 2 text block
`text-cell-h3`	Heading level 3 text block
`text-cell-p`	Paragraph text block

For the complete list of block types, see the block type schema.

Outputs

Outputs follow the Jupyter output format:

outputs:
  # Stream output (stdout/stderr)
  - name: stdout
    output_type: stream
    text: |
      Hello World!

  # Rich display output
  - data:
      text/plain: "<DataFrame with 100 rows>"
      text/html: "<table>...</table>"
      application/vnd.deepnote.dataframe.v3+json:
        columns: [...]
        rows: [...]
    output_type: execute_result
    execution_count: 1

  # Error output
  - ename: ValueError
    evalue: "invalid input"
    output_type: error
    traceback:
      - "Traceback (most recent call last):"
      - "..."
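
To show how a consumer might handle the three output shapes above, here is a minimal sketch that turns one output entry (already parsed into a Python dict) into a single line of text. The function name `render_output` is hypothetical, and the preference for `text/plain` is an assumption made for illustration.

```python
def render_output(output: dict) -> str:
    """Summarize one Jupyter-style output entry as a single line of text."""
    kind = output.get("output_type")

    if kind == "stream":
        text = output.get("text", "")
        # The Jupyter format allows text as a string or a list of strings.
        if isinstance(text, list):
            text = "".join(text)
        return f"[{output.get('name', 'stdout')}] {text.rstrip()}"

    if kind in ("execute_result", "display_data"):
        data = output.get("data", {})
        # Assumption: prefer the plain-text representation when present,
        # otherwise list the available MIME types.
        if "text/plain" in data:
            return str(data["text/plain"])
        return "<" + ", ".join(sorted(data)) + ">"

    if kind == "error":
        return f"{output.get('ename')}: {output.get('evalue')}"

    return f"<unsupported output_type: {kind!r}>"


outputs = [
    {"output_type": "stream", "name": "stdout", "text": "Hello World!\n"},
    {"output_type": "execute_result", "execution_count": 1,
     "data": {"text/plain": "<DataFrame with 100 rows>"}},
    {"output_type": "error", "ename": "ValueError", "evalue": "invalid input",
     "traceback": ["Traceback (most recent call last):", "..."]},
]

for entry in outputs:
    print(render_output(entry))
```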

Snapshots (`.snapshot.deepnote`)

A snapshot is a .deepnote file with execution outputs populated. It uses the same format as your source file, but includes block outputs and execution metadata.

What is a Snapshot?

Snapshots store execution outputs separately from your source code. The _latest snapshot accumulates outputs as you run blocks, always containing the most recent output for each block. It includes:

All code and configuration from the source file
Execution outputs from blocks (updated incrementally as blocks run)
Metadata about when blocks were executed
Environment information (the environment field is required in snapshot files)

Each block's contentHash lets you verify whether the code that produced an output matches the current code.
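
The check described above can be sketched as follows. The document states only that `contentHash` is a SHA-256 hash of the `content` field. The UTF-8 encoding, the hex rendering, and the `sha256:` prefix (taken from the example values shown in this document) are assumptions, and both function names are hypothetical.

```python
import hashlib


def content_hash(content: str) -> str:
    """Return a SHA-256 digest of a block's content with a 'sha256:' prefix.

    Assumption: the content is hashed as UTF-8 bytes and rendered as hex.
    """
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return f"sha256:{digest}"


def output_is_current(snapshot_block: dict, source_block: dict) -> bool:
    """True when the hash stored in the snapshot matches the current source."""
    return snapshot_block.get("contentHash") == content_hash(source_block["content"])


source = {"id": "block-1", "content": "print('Hello World!')\n"}
snapshot = {"id": "block-1", "contentHash": content_hash(source["content"]), "outputs": []}

print(output_is_current(snapshot, source))  # True

# Editing the source makes the stored outputs stale.
source["content"] = "print('Goodbye')\n"
print(output_is_current(snapshot, source))  # False
```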

Why Use Snapshots?

Benefit	Description
Clean Git history	Your `.deepnote` files only change when code changes, not when outputs change
Execution history	Review what outputs looked like at different points in time
Code provenance	`contentHash` verifies which code version produced each output
Collaboration	Share specific execution states with teammates

Note: Snapshots capture code provenance (via contentHash) but not data provenance. If external data sources change between runs, outputs may differ even with identical code.

Directory Structure

When snapshots are enabled, they're stored in a snapshots/ folder alongside your project:

my-project/
├── customer-analysis.deepnote          # Source file (no outputs)
├── data-pipeline.deepnote              # Another source file
└── snapshots/
    ├── customer-analysis_2e814690-4f02-465c-8848-5567ab9253b7_latest.snapshot.deepnote
    ├── customer-analysis_2e814690-4f02-465c-8848-5567ab9253b7_2025-01-08T10-30-00.snapshot.deepnote
    └── data-pipeline_a1b2c3d4-5678-90ab-cdef-1234567890ab_latest.snapshot.deepnote

Naming Convention

Snapshot files follow this pattern:

{project-name}_{project-id}_{timestamp}.snapshot.deepnote

Component	Description	Example
`project-name`	Slugified project name	`customer-analysis`
`project-id`	Full UUID v4 of the project	`2e814690-4f02-465c-8848-5567ab9253b7`
`timestamp`	ISO 8601-style timestamp with colons replaced by hyphens (filename-safe), or `latest`	`2025-01-08T10-30-00` or `latest`

Examples:

customer-analysis_2e814690-4f02-465c-8848-5567ab9253b7_latest.snapshot.deepnote
customer-analysis_2e814690-4f02-465c-8848-5567ab9253b7_2025-01-08T10-30-00.snapshot.deepnote
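
A tool that scans a `snapshots/` folder might build and parse these names along the following lines. This is a sketch based only on the pattern and examples above: the regular expression assumes a lowercase hyphenated UUID and the `YYYY-MM-DDTHH-MM-SS` timestamp shape, and `snapshot_filename`, `parse_snapshot_filename`, and `SnapshotName` are hypothetical names. Slugifying the project name is left to the caller because the slug rules are not specified here.

```python
import re
from typing import NamedTuple, Optional

# Assumed shape: {project-name}_{project-id}_{timestamp}.snapshot.deepnote
SNAPSHOT_RE = re.compile(
    r"^(?P<name>.+)_"
    r"(?P<project_id>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})_"
    r"(?P<timestamp>latest|\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2})"
    r"\.snapshot\.deepnote$"
)


class SnapshotName(NamedTuple):
    project_name: str
    project_id: str
    timestamp: str  # "latest" or e.g. "2025-01-08T10-30-00"


def snapshot_filename(slug: str, project_id: str, timestamp: str = "latest") -> str:
    """Build a snapshot file name from an already-slugified project name."""
    return f"{slug}_{project_id}_{timestamp}.snapshot.deepnote"


def parse_snapshot_filename(filename: str) -> Optional[SnapshotName]:
    """Split a snapshot file name into its parts, or return None if it does not match."""
    match = SNAPSHOT_RE.match(filename)
    if match is None:
        return None
    return SnapshotName(match["name"], match["project_id"], match["timestamp"])


name = snapshot_filename("customer-analysis", "2e814690-4f02-465c-8848-5567ab9253b7")
print(name)
print(parse_snapshot_filename(name))
```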

The `latest` Snapshot

The _latest.snapshot.deepnote file accumulates outputs as you run blocks, always containing the most recent output for each block. When you run a block:

Its output is written to the latest snapshot (replacing any previous output for that block)
A timestamped copy is created for history
The main .deepnote file remains output-free

This means the latest snapshot may contain outputs from blocks run at different times, not necessarily from a single execution run.
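
The accumulation behaviour can be modelled with a small sketch. It operates on a snapshot already parsed into nested Python dicts and only illustrates the "replace the previous output for that block" rule. `record_block_run` is a hypothetical helper, not Deepnote's implementation, and writing the file back to disk is omitted.

```python
import copy


def record_block_run(latest: dict, block_id: str, outputs: list) -> None:
    """Replace the stored outputs of one block in an in-memory latest snapshot."""
    for notebook in latest["project"]["notebooks"]:
        for block in notebook["blocks"]:
            if block["id"] == block_id:
                # Only this block changes; outputs of other blocks stay as they were.
                block["outputs"] = copy.deepcopy(outputs)
                return
    raise KeyError(f"block {block_id!r} not found in snapshot")


latest = {
    "project": {
        "notebooks": [
            {"blocks": [{"id": "A", "outputs": []}, {"id": "B", "outputs": []}]}
        ]
    }
}

record_block_run(latest, "A", [{"output_type": "stream", "name": "stdout", "text": "run 1\n"}])
record_block_run(latest, "B", [{"output_type": "stream", "name": "stdout", "text": "from B\n"}])
record_block_run(latest, "A", [{"output_type": "stream", "name": "stdout", "text": "run 2\n"}])

for block in latest["project"]["notebooks"][0]["blocks"]:
    print(block["id"], block["outputs"][0]["text"].strip())
```

After these three calls, block A holds only its second output while block B still holds the output from its single earlier run, which is the mixed state described above.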

Timestamped Snapshots

Unlike the latest snapshot which accumulates outputs over time, timestamped snapshots capture a point-in-time execution of the entire notebook. When you run all blocks in a notebook (or trigger a full execution), a timestamped snapshot is created that represents that specific run.

Key differences:

Aspect	`latest` Snapshot	Timestamped Snapshot
Creation	Updated incrementally as individual blocks run	Created from a complete notebook execution
Consistency	May contain outputs from different execution times	All outputs from the same execution run
Purpose	Quick access to most recent state	Historical record of a specific run
Overwriting	Always overwritten with new outputs	Immutable once created

Example workflow:

1. You run block A at 10:00 AM → latest snapshot updated
2. You run block B at 10:15 AM → latest snapshot updated again
3. You run all blocks at 10:30 AM → new timestamped snapshot created: {project-name}_{project-id}_2025-01-08T10-30-00.snapshot.deepnote

The timestamped snapshot from step 3 contains a coherent execution state where all outputs were produced together, while the latest snapshot may contain block A's output from 10:00 and block B's output from 10:15.

Snapshot Structure

Since a snapshot is a .deepnote file, it has the same structure, with outputs, contentHash, and snapshotHash populated:

version: 1.0.0
metadata:
  createdAt: "2025-01-08T10:30:00.000Z"
  modifiedAt: "2025-01-08T10:30:00.000Z"
  snapshotHash: sha256:def456... # top-level hash for quick comparison
project:
  id: 2e814690-4f02-465c-8848-5567ab9253b7
  name: Customer Analysis
  notebooks:
    - id: e132b172-b114-410e-8331-011517db664f
      name: Main Notebook
      blocks:
        - id: block-1
          type: code
          content: |
            import pandas as pd
            df = pd.read_csv('data.csv')
            df.head()
          contentHash: sha256:a1b2c3... # hash of the content field
          metadata:
            execution_start: 1704710400000
            execution_millis: 150
            execution_context_id: ctx-123
          executionCount: 1
          outputs:
            - data:
                text/html: "<table>...</table>"
              output_type: execute_result
              execution_count: 1

Snapshot Hash

The snapshotHash in metadata provides a single value to quickly check if a snapshot is still in sync with the current code and environment.

Computed from:

All block contentHash values across all notebooks
environment.hash (if present)
version (file format version)
project.integrations (id, type, name of each integration)

Explicitly excluded:

Temporal fields (createdAt, modifiedAt, execution timestamps)
Execution metadata
Block inputs
Block outputs

This means snapshotHash answers the question "Has the code, environment, or integrations changed since this snapshot was taken?" without being affected by when blocks were run, by data source changes, or by what outputs the blocks produced.
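
The following sketch shows one way a hash with these properties could be computed from a parsed snapshot. The real serialization and ordering are not specified in this document, so everything about the canonical form here (a JSON fingerprint with sorted keys, the field names inside it, the `sha256:` prefix) is an assumption. Integrations are read from the top level of the file, as in the Structure example, with a fallback to `project.integrations`. `snapshot_hash` is a hypothetical name.

```python
import hashlib
import json


def snapshot_hash(snapshot: dict) -> str:
    """Hash only the inputs listed above: content hashes, environment, version, integrations."""
    project = snapshot["project"]
    integrations = snapshot.get("integrations") or project.get("integrations") or []

    fingerprint = {
        "version": snapshot["version"],
        "environmentHash": (snapshot.get("environment") or {}).get("hash"),
        "integrations": [
            {"id": i.get("id"), "type": i.get("type"), "name": i.get("name")}
            for i in integrations
        ],
        "contentHashes": [
            block.get("contentHash")
            for notebook in project["notebooks"]
            for block in notebook["blocks"]
        ],
    }
    # Assumed canonical form: compact JSON with sorted keys.
    canonical = json.dumps(fingerprint, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


snapshot = {
    "version": "1.0.0",
    "metadata": {"modifiedAt": "2025-01-08T10:30:00.000Z"},
    "project": {
        "notebooks": [{"blocks": [{"id": "block-1", "contentHash": "sha256:a1b2c3", "outputs": []}]}]
    },
}
before = snapshot_hash(snapshot)

# Outputs and timestamps are excluded, so changing them leaves the hash alone.
block = snapshot["project"]["notebooks"][0]["blocks"][0]
block["outputs"] = [{"output_type": "stream", "name": "stdout", "text": "hi\n"}]
snapshot["metadata"]["modifiedAt"] = "2025-01-09T08:00:00.000Z"
print(snapshot_hash(snapshot) == before)  # True

# A changed content hash means the code changed, so the snapshot hash changes too.
block["contentHash"] = "sha256:ffffff"
print(snapshot_hash(snapshot) == before)  # False
```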

Migration Guide

Deepnote provides a CLI tool to convert notebooks from other formats to .deepnote.

Supported Formats

Format	Extension	Description
Jupyter Notebook	`.ipynb`	Standard Jupyter notebook format
Marimo	`.py`	Marimo reactive notebook format
Percent Format	`.py`	Python files with `# %%` cell markers
Quarto	`.qmd`	Quarto markdown notebooks

Using `deepnote-convert`

Install the CLI:

npm install -g @deepnote/convert

Convert a notebook:

deepnote-convert notebook.ipynb -o notebook.deepnote
deepnote-convert marimo_notebook.py -o notebook.deepnote
deepnote-convert script.py -o notebook.deepnote  # percent format
deepnote-convert document.qmd -o notebook.deepnote

For more details, see the @deepnote/convert documentation.

FAQ

Why YAML instead of JSON?

YAML produces cleaner diffs when reviewing changes in Git. Multi-line strings (code cells) are more readable, and the format supports comments for documentation.

Can I have multiple notebooks in one file?

Yes! A .deepnote file is a project that can contain multiple notebooks. Each notebook has its own ID and can be executed independently.

How do I share a specific execution state?

Share the timestamped snapshot file. It contains everything needed to see exactly what outputs were produced at that point in time.