Fallback to kernelspec to check if it's a Python notebook (#12875)

## Summary

This PR adds fallback logic to `is_python_notebook`: when `language_info`
is absent, it checks the `kernelspec.language` field instead.
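
As a rough illustration of the fallback order described above, here is a minimal sketch. The structs are simplified stand-ins (no serde, only the fields needed here) rather than Ruff's actual types, and defaulting to Python when both fields are missing is an assumption made for this example.

```rust
/// Simplified stand-in for the notebook's `language_info` metadata.
#[derive(Default)]
struct LanguageInfo {
    name: String,
}

/// Simplified stand-in for the notebook's `kernelspec` metadata.
#[derive(Default)]
struct Kernelspec {
    language: Option<String>,
}

/// Simplified stand-in for the notebook-level metadata.
#[derive(Default)]
struct Metadata {
    kernelspec: Option<Kernelspec>,
    language_info: Option<LanguageInfo>,
}

/// Returns `true` if the notebook should be treated as a Python notebook.
fn is_python_notebook(metadata: &Metadata) -> bool {
    // Preferred source: `language_info.name`.
    if let Some(language_info) = &metadata.language_info {
        return language_info.name == "python";
    }
    // Fallback: `kernelspec.language`, which various tools populate.
    if let Some(language) = metadata
        .kernelspec
        .as_ref()
        .and_then(|kernelspec| kernelspec.language.as_deref())
    {
        return language == "python";
    }
    // Assumption for this sketch: with no language metadata at all,
    // treat the notebook as Python.
    true
}
```

Note that `language_info` takes precedence whenever it is present, so `kernelspec.language` only matters for notebooks that omit it.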

Reference implementation in VS Code:
1c31e75898/extensions/ipynb/src/deserializers.ts (L20-L22)

Kernels are also required to declare the `language` they implement,
according to the kernel specs reference:
https://jupyter-client.readthedocs.io/en/stable/kernels.html#kernel-specs
That reference describes the `kernel.json` file, but the same field is
also included in the notebook metadata.

Closes: #12281

## Test Plan

Add a test case for `is_python_notebook` and include the test notebook
for round trip validation.

The test notebook contains two cells: one is JavaScript (denoted via the
`vscode.languageId` metadata) and the other is Python (no metadata). The
notebook metadata contains only `kernelspec`; `language_info` is
absent.
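
The shape of that test notebook can be modelled roughly as follows. This is a hypothetical, simplified model rather than Ruff's real types or its actual test: the `Cell` and `Notebook` structs, the `python_cells` helper, the cell sources, and the default when no language metadata exists are all assumptions made for illustration.

```rust
/// Hypothetical, simplified model of a notebook cell.
struct Cell {
    /// Mirrors the `vscode.languageId` cell metadata; `None` when absent.
    vscode_language_id: Option<&'static str>,
    source: &'static str,
}

/// Hypothetical, simplified model of the notebook.
struct Notebook {
    /// Mirrors `metadata.kernelspec.language`.
    kernelspec_language: Option<&'static str>,
    /// Mirrors `metadata.language_info.name`.
    language_info_name: Option<&'static str>,
    cells: Vec<Cell>,
}

impl Notebook {
    /// `language_info` wins when present; otherwise fall back to `kernelspec`.
    fn is_python_notebook(&self) -> bool {
        match self.language_info_name.or(self.kernelspec_language) {
            Some(language) => language == "python",
            // Assumption: no language metadata at all means Python.
            None => true,
        }
    }

    /// Cells without a `vscode.languageId`, or with one set to `python`.
    fn python_cells(&self) -> Vec<&Cell> {
        self.cells
            .iter()
            .filter(|cell| matches!(cell.vscode_language_id, None | Some("python")))
            .collect()
    }
}

/// Builds a notebook shaped like the one described in the test plan.
fn test_notebook() -> Notebook {
    Notebook {
        kernelspec_language: Some("python"),
        language_info_name: None, // `language_info` is absent
        cells: vec![
            Cell {
                vscode_language_id: Some("javascript"),
                source: "console.log('hello')",
            },
            Cell {
                vscode_language_id: None,
                source: "print('hello')",
            },
        ],
    }
}
```

In this model, the notebook is recognised as Python through `kernelspec` alone, and only the cell without a `vscode.languageId` is treated as Python code.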

I also verified that this is a valid notebook by opening it in Jupyter
Lab and VS Code, and by running the `nbformat` validator.
Dhruv Manilawala 2024-08-14 12:36:09 +05:30 committed by GitHub
parent 89c8b49027
commit 2520ebb145
3 changed files with 89 additions and 24 deletions


@@ -169,7 +169,7 @@ pub struct CellMetadata {
/// preferred language.
/// <https://github.com/microsoft/vscode/blob/e6c009a3d4ee60f352212b978934f52c4689fbd9/extensions/ipynb/src/serializers.ts#L117-L122>
pub vscode: Option<CodeCellMetadataVSCode>,
-/// Catch-all for metadata that isn't required by Ruff.
+/// For additional properties that isn't required by Ruff.
#[serde(flatten)]
pub extra: HashMap<String, Value>,
}
@@ -190,8 +190,8 @@ pub struct RawNotebookMetadata {
/// The author(s) of the notebook document
pub authors: Option<Value>,
-/// Kernel information.
-pub kernelspec: Option<Value>,
+/// Kernel information.
+pub kernelspec: Option<Kernelspec>,
/// Language information.
pub language_info: Option<LanguageInfo>,
/// Original notebook format (major number) before converting the notebook between versions.
/// This should never be written to a file.
@@ -206,6 +206,23 @@ pub struct RawNotebookMetadata {
/// Kernel information.
#[skip_serializing_none]
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]
pub struct Kernelspec {
/// The language name. This isn't mentioned in the spec but is populated by various tools and
/// can be used as a fallback if [`language_info`] is missing.
///
/// This is also used by VS Code to determine the preferred language of the notebook:
/// <https://github.com/microsoft/vscode/blob/1c31e758985efe11bc0453a45ea0bb6887e670a4/extensions/ipynb/src/deserializers.ts#L20-L22>.
///
/// [`language_info`]: RawNotebookMetadata::language_info
pub language: Option<String>,
/// For additional properties that isn't required by Ruff.
#[serde(flatten)]
pub extra: HashMap<String, Value>,
}
/// Language information.
#[skip_serializing_none]
#[derive(Clone, Debug, Serialize, Deserialize, PartialEq)]
pub struct LanguageInfo {
/// The codemirror mode to use for code in this language.
pub codemirror_mode: Option<Value>,